
9 December 2023

Simon Josefsson: Classic McEliece goes to IETF and OpenSSH

My earlier work on Streamlined NTRU Prime has been progressing along. The IETF document on sntrup761 in SSH has passed several process points. GnuPG's libgcrypt has added support for sntrup761. The libssh support for sntrup761 is working, but the merge request is stuck, mostly due to lack of time to debug why the regression test suite sporadically errors out in non-sntrup761-related parts with the patch.

The foundation for lattice-based post-quantum algorithms has some uncertainty around it, and I have felt that there is more to the post-quantum story than adding sntrup761 to implementations. Classic McEliece has been mentioned to me a couple of times, and I took some time to learn it and did a cut'n'paste job of the proposed ISO standard and published draft-josefsson-mceliece in the IETF to make the algorithm easily available to the IETF community. A high-quality implementation of Classic McEliece has been published as libmceliece, and I've been supporting the work of Jan Mojžíš to package libmceliece for Debian; alas, it has been stuck in the ftp-master NEW queue for manual review for over two months. The pre-dependencies librandombytes and libcpucycles are available in Debian already.

All that text writing and packaging work set the scene to write some code. When I added support for sntrup761 in libssh, I became familiar with the OpenSSH code base, so it was natural to return to OpenSSH to experiment with a new SSH KEX for Classic McEliece. DJB suggested picking mceliece6688128 and combining it with the existing X25519+sntrup761 or with plain X25519. While a three-algorithm hybrid between X25519, sntrup761 and mceliece6688128 would be a simple drop-in for those who don't want to lose the benefits offered by sntrup761, I decided to start the journey with a pure combination of X25519 and mceliece6688128. The key combiner in sntrup761x25519 is a simple SHA512 call, and the only good thing I can say about it is that it is simple to describe and implement, and doesn't raise too many questions since it is already deployed (a conceptual sketch appears after the build steps below).

After procrastinating coding for months, once I sat down to work it only took a couple of hours until I had a successful Classic McEliece SSH connection. I suppose my brain had sorted everything out in the background before I started. To reproduce it, please try the following in a Debian testing environment (I use podman to get a clean environment).
# podman run -it --rm debian:testing-slim
apt update
apt dist-upgrade -y
apt install -y wget python3 librandombytes-dev libcpucycles-dev gcc make git autoconf libz-dev libssl-dev
cd ~
wget -q -O- https://lib.mceliece.org/libmceliece-20230612.tar.gz | tar xfz -
cd libmceliece-20230612/
./configure
make install
ldconfig
cd ..
git clone https://gitlab.com/jas/openssh-portable
cd openssh-portable
git checkout jas/mceliece
autoreconf
./configure # verify 'libmceliece support: yes'
make # CC="cc -DDEBUG_KEX=1 -DDEBUG_KEXDH=1 -DDEBUG_KEXECDH=1"
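As an aside, the key combiner mentioned earlier can be illustrated conceptually: the two shared secrets are concatenated and hashed once with SHA-512. The following is only a sketch of the idea using openssl and hypothetical file names, not the exact encoding OpenSSH hashes over:
# k_mceliece.bin and k_x25519.bin are illustrative files holding the two
# shared secrets; the combined session key material is one SHA-512 over both.
cat k_mceliece.bin k_x25519.bin | openssl dgst -sha512 -binary > k_combined.bin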
You should now have a working SSH client and server that supports Classic McEliece! Verify support by running ./ssh -Q kex; it should mention mceliece6688128x25519-sha512@openssh.com. To have it print plenty of debug output, you may remove the # character on the final line above, but don't use such a build in production. You can test it as follows:
./ssh-keygen -A # writes to /usr/local/etc/ssh_host_...
# setup public-key based login by running the following:
./ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
adduser --system sshd
mkdir /var/empty
while true; do $PWD/sshd -p 2222 -f /dev/null; done &
./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date
On the client you should see output like this:
OpenSSH_9.5p1, OpenSSL 3.0.11 19 Sep 2023
...
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: mceliece6688128x25519-sha512@openssh.com
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-ed25519 SHA256:YognhWY7+399J+/V8eAQWmM3UFDLT0dkmoj3pIJ0zXs
...
debug1: Host '[localhost]:2222' is known and matches the ED25519 host key.
debug1: Found key in /root/.ssh/known_hosts:1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 134217728 blocks
...
debug1: Sending command: date
debug1: pledge: fork
debug1: permanently_set_uid: 0/0
Environment:
  USER=root
  LOGNAME=root
  HOME=/root
  PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
  MAIL=/var/mail/root
  SHELL=/bin/bash
  SSH_CLIENT=::1 46894 2222
  SSH_CONNECTION=::1 46894 ::1 2222
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
Sat Dec  9 22:22:40 UTC 2023
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 1048044, received 3500 bytes, in 0.0 seconds
Bytes per second: sent 23388935.4, received 78108.6
debug1: Exit status 0
Notice the kex: algorithm: mceliece6688128x25519-sha512@openssh.com output. How about network bandwidth usage? Below is a comparison of complete SSH client connections such as the one above, which logs in, runs date, and logs out. Plain X25519 is around 7 kB, X25519 with sntrup761 is around 9 kB, and mceliece6688128 with X25519 is around 1 MB. Yes, Classic McEliece has large keys, but for many environments, 1 MB of data for session establishment will barely be noticeable.
./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date 2>&1 | grep ^Transferred
Transferred: sent 3028, received 3612 bytes, in 0.0 seconds
./ssh -v -p 2222 localhost -oKexAlgorithms=sntrup761x25519-sha512@openssh.com date 2>&1 | grep ^Transferred
Transferred: sent 4212, received 4596 bytes, in 0.0 seconds
./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date 2>&1 | grep ^Transferred
Transferred: sent 1048044, received 3764 bytes, in 0.0 seconds
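The roughly 1 MB sent in the mceliece6688128 case is dominated by the Classic McEliece public key, which is 1,044,992 bytes for this parameter set. A quick back-of-the-envelope check shows the remaining traffic stays in the same ballpark as plain X25519:
echo $((1048044 - 1044992))   # 3052 bytes besides the public key, close to the
                              # 3028 bytes sent with curve25519-sha256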
So how about session establishment time?
date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date > /dev/null 2>&1; i=$(expr $i + 1); done; date
Sat Dec  9 22:39:19 UTC 2023
Sat Dec  9 22:39:25 UTC 2023
# 6 seconds
date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=sntrup761x25519-sha512@openssh.com date > /dev/null 2>&1; i=$(expr $i + 1); done; date
Sat Dec  9 22:39:29 UTC 2023
Sat Dec  9 22:39:38 UTC 2023
# 9 seconds
date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date > /dev/null 2>&1; i=$(expr $i + 1); done; date
Sat Dec  9 22:39:55 UTC 2023
Sat Dec  9 22:40:07 UTC 2023
# 12 seconds
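Each loop above makes 101 connections (i runs from 0 through 100), so the rough per-connection cost works out as follows:
echo "scale=1; 6 * 1000 / 101" | bc    # curve25519-sha256: ~59 ms per connection
echo "scale=1; 12 * 1000 / 101" | bc   # mceliece6688128x25519: ~119 ms per connection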
I never noticed the overhead of adding sntrup761, so I'm pretty sure I wouldn't notice this increase either. This is all running on my laptop, which runs Trisquel, so take it with a grain of salt, but at least the magnitude is clear. Several future work items remain. Happy post-quantum SSH'ing!

Update: Changing the mceliece6688128_keypair call to mceliece6688128f_keypair (i.e., using the fully compatible f-variant) results in McEliece being just as fast as sntrup761 on my machine.

Update 2023-12-26: An initial IETF document, draft-josefsson-ssh-mceliece-00, has been published.

6 December 2023

Reproducible Builds: Reproducible Builds in November 2023

Welcome to the November 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap: whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.

Reproducible Builds Summit 2023

Between October 31st and November 2nd, we held our seventh Reproducible Builds Summit in Hamburg, Germany! Amazingly, the agenda and the notes from all of the sessions are online; many thanks to everyone who wrote notes during the sessions.

As a followup to one idea started at the summit, Alexander Couzens and Holger Levsen began work on a cache (or tailored front-end) for the snapshot.debian.org service. The general idea is that, when rebuilding Debian, you do not actually need the whole ~140 TB of data from snapshot.debian.org; rather, only a very small subset of the packages are ever used for building. It turns out that, for amd64, arm64, armhf, i386, ppc64el, riscv64 and s390x for Debian trixie, unstable and experimental, this is only around 500 GB, i.e. less than 1%. Although the new service is not yet ready for use, it has already provided a promising outlook in this regard. More information is available on https://rebuilder-snapshot.debian.net and we hope that this service becomes usable in the coming weeks.

The adjacent picture shows a sticky note authored by Jan-Benedict Glaw at the summit in Hamburg, confirming Holger Levsen's theory that rebuilding all Debian packages needs only a very small subset of packages; the text states that 69,200 packages (in Debian sid) list 24,850 packages in their .buildinfo files, in 8,0200 variations. This little piece of paper was the beginning of rebuilder-snapshot and is a direct outcome of the summit!

The Reproducible Builds team would like to thank our event sponsors, who include Mullvad VPN, openSUSE, Debian, Software Freedom Conservancy, Allotropia and Aspiration Tech.
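Incidentally, the "less than 1%" figure above is easy to sanity-check from the round numbers quoted:
echo "scale=2; 500 * 100 / 140000" | bc   # 500 GB out of ~140 TB is roughly .35 percent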

Beyond Trusting FOSS presentation at SeaGL

On November 4th, Vagrant Cascadian presented Beyond Trusting FOSS at SeaGL in Seattle, WA in the United States. Founded in 2013, SeaGL is a free, grassroots technical summit dedicated to spreading awareness and knowledge about free and open source software, hardware and culture. The summary of Vagrant's talk mentions that it will:
[…] introduce the concepts of Reproducible Builds, including best practices for developing and releasing software, the tools available to help diagnose issues, and touch on progress towards solving decades-old deeply pervasive fundamental security issues… Learn how to verify and demonstrate trust, rather than simply hoping everything is OK!
Germane to the contents of the talk, the slides for Vagrant's talk can be built reproducibly, resulting in a PDF with a SHA1 of cfde2f8a0b7e6ec9b85377eeac0661d728b70f34 when built on Debian bookworm, and c21fab273232c550ce822c4b0d9988e6c49aa2c3 on Debian sid, at the time of writing.
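Verifying such a claim follows the usual reproducible-builds pattern: rebuild the artifact yourself and compare checksums. A minimal sketch, assuming the PDF has been built per the instructions in the talk's repository (the file name here is illustrative):
sha1sum slides.pdf   # expect cfde2f8a0b7e6ec9b85377eeac0661d728b70f34 on bookworm,
                     # c21fab273232c550ce822c4b0d9988e6c49aa2c3 on sid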

Human Factors in Software Supply Chain Security

Marcel Fourné, Dominik Wermke, Sascha Fahl and Yasemin Acar have published an article in a Special Issue of the IEEE's Security & Privacy magazine. Entitled A Viewpoint on Human Factors in Software Supply Chain Security: A Research Agenda, the paper justifies the need for reproducible builds to reach developers and end-users specifically, and furthermore points out some under-researched topics that we have seen mentioned in interviews. An author pre-print of the article is available in PDF form.

Community updates

On our mailing list this month:

openSUSE updates

Bernhard M. Wiedemann has created a wiki page outlining a proposal to create a general-purpose Linux distribution which consists of 100% bit-reproducible packages, albeit minus the embedded signature within RPM files. It would be based on openSUSE Tumbleweed or, if available, its Slowroll variant. In addition, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Ubuntu Launchpad now supports .buildinfo files

Back in 2017, Steve Langasek filed a bug against Ubuntu's Launchpad code hosting platform to report that .changes files (artifacts of building Ubuntu and Debian packages) reference .buildinfo files that aren't actually exposed by Launchpad itself. This was causing issues when attempting to process .changes files with tools such as Lintian. However, it was noticed last month that, in early August of this year, Simon Quigley had resolved this issue, and .buildinfo files are now available from the Launchpad system.

PHP reproducibility updates

There have been two updates from the PHP programming language this month. Firstly, the widely-deployed PHPUnit framework has recently released version 10.5.0, which introduces the inclusion of a composer.lock file, ensuring total reproducibility of the shipped binary file. Further details and the discussion that went into their particular implementation can be found on the associated GitHub pull request. In addition, the presentation Leveraging Nix in the PHP ecosystem was given in late October at the PHP International Conference in Munich by Pol Dellaiera. While the video replay is not yet available, the (reproducible) presentation slides and speaker notes are available.
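As background on the composer.lock mechanism mentioned above: when a lock file is shipped, installing dependencies pins the exact package versions recorded in it rather than re-resolving them. A minimal illustration:
composer install   # uses the versions pinned in composer.lock when the file is present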

diffoscope changes

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including:
  • Improving DOS/MBR extraction by adding support for 7z. [ ]
  • Adding a missing RequiredToolNotFound import. [ ]
  • As a UI/UX improvement, try and avoid printing an extended traceback if diffoscope runs out of memory. [ ]
  • Mark diffoscope as stable on PyPI.org. [ ]
  • Uploading version 252 to Debian unstable. [ ]

Website updates

A huge number of notes were added to our website that were taken at our recent Reproducible Builds Summit held between October 31st and November 2nd in Hamburg, Germany. In particular, a big thanks to Arnout Engelen, Bernhard M. Wiedemann, Daan De Meyer, Evangelos Ribeiro Tzaras, Holger Levsen and Orhun Parmaksız. In addition to this, a number of other changes were made, including:

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In October, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Track packages marked as Priority: important in a new package set. [ ][ ]
    • Stop scheduling packages that fail to build from source in bookworm [ ] and bullseye. [ ].
    • Add old releases dashboard link in web navigation. [ ]
    • Permit the pool_buildinfos script to be re-run for a specific year. [ ]
    • Grant jbglaw access to the osuosl4 node [ ][ ] along with lynxis [ ].
    • Increase RAM on the amd64 Ionos builders from 48 GiB to 64 GiB; thanks IONOS! [ ]
    • Move buster to archived suites. [ ][ ]
    • Reduce the number of arm64 architecture workers from 24 to 16 in order to improve stability [ ], reduce the workers for amd64 from 32 to 28 and, for i386, reduce from 12 down to 8 [ ].
    • Show the entire build history of each Debian package. [ ]
    • Stop scheduling already tested package/version combinations in Debian bookworm. [ ]
  • Snapshot service for rebuilders
    • Add an HTTP-based API endpoint. [ ][ ]
    • Add a Gunicorn instance to serve the HTTP API. [ ]
    • Add an NGINX config [ ][ ][ ][ ]
  • System-health:
    • Detect failures due to HTTP 503 Service Unavailable errors. [ ]
    • Detect failures to update package sets. [ ]
    • Detect unmet dependencies. (This usually occurs with builds of Debian live-build.) [ ]
  • Misc-related changes:
    • do install systemd-oomd on jenkins. [ ]
    • fix harmless typo in squid.conf for codethink04. [ ]
    • fixup: reproducible Debian: add gunicorn service to serve /api for rebuilder-snapshot.d.o. [ ]
    • Increase codethink04 s Squid cache_dir size setting to 16 GiB. [ ]
    • Don t install systemd-oomd as it unfortunately kills sshd [ ]
    • Use debootstrap from backports when commissioning nodes. [ ]
    • Add the live_build_debian_stretch_gnome, debsums-tests_buster and debsums-tests_buster jobs to the zombie list. [ ][ ]
    • Run jekyll build with the --watch argument when building the Reproducible Builds website. [ ]
    • Misc node maintenance. [ ][ ][ ]
Other changes were made as well, however, including Mattia Rizzolo fixing rc.local's Bash syntax so it can actually run [ ], commenting away some file cleanup code that is (potentially) deleting too much [ ] and fixing the html_breakages page for Debian package builds [ ]. Finally, he diagnosed and submitted a patch to add an AddEncoding gzip .gz line to the tests.reproducible-builds.org Apache configuration so that Gzip files aren't re-compressed as Gzip, which some clients can't deal with (as well as being a waste of time). [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

3 December 2023

Iustin Pop: Life, getting sick and unfit

Like clockwork, like every autumn, I got sick again. Just a flu; actually, probably two in a row. I don't really understand it: from January to September I'm feeling really awesome, and I manage to do sports five days a week, or more. Then September comes and things start degrading, and then in October or November I get sick, and it takes me ~3 weeks to recover, during which I'm not even reliably managing 5K steps a day (from walking), to say nothing of running or swimming or biking. My Garmin statistics for November compared to October (which wasn't a good month either) are depressing. Let's not even bring up, for example, July. I can see three potential pathways, and basically all three are similar; I'm just not sure what's the root cause (vs. a trigger). I just did a vitamin D test, and despite regularly taking supplements, it was below the normal range. So clearly, somewhere in there, lack of vitamin D is a problem. I probably should do a yearly check-up at the end of September? Sigh, I still haven't found the detailed user manual for Homo Sapiens. If anyone has it, specifically for a version from the late 1900s, I'd be thankful. Till next time; hopefully by then things will get better.

26 November 2023

Dirk Eddelbuettel: RQuantLib 0.4.20 on CRAN: More Maintenance

A new release 0.4.20 of RQuantLib arrived at CRAN earlier today, and has already been uploaded to Debian as well. QuantLib is a rather comprehensive free/open-source library for quantitative finance. RQuantLib connects (some parts of) it to the R environment and language, and has been part of CRAN for more than twenty years (!!) as it was one of the first packages I uploaded there.

This release of RQuantLib brings a few more updates for nags triggered by recent changes in the upcoming R release (aka 'r-devel', usually due in April). The Rd parser now identifies curly braces that lack a preceding macro, usually a typo, as it was here, which affected three files. The printf (or alike) format checker found two more small issues. The run-time checker for examples was unhappy with the callable-bond example, so we only run it in interactive mode now. Lastly, I had already commented out the setting for a C++14 compilation (required by the remaining Boost headers) as C++14 has been the default since R 4.2.0 (with suitable compilers, at least). Those who need it explicitly will have to uncomment the line in src/Makevars.in. The expanded printf format string checks also found a need for a small change in Rcpp, so the development version (now 1.0.11.5) has that addressed; the change will be part of Rcpp 1.0.12 in January.

Changes in RQuantLib version 0.4.20 (2023-11-26)
  • Correct three help pages with stray curly braces
  • Correct two printf format strings
  • Comment-out explicit selection of C++14
  • Wrap one example inside 'if (interactive())' to not exceed total running time limit at CRAN checks

Courtesy of my CRANberries, there is also a diffstat report for this release 0.4.20. As always, more detailed information is on the RQuantLib page. Questions, comments, etc. should go to the rquantlib-devel mailing list. Issue tickets can be filed at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

21 November 2023

Mike Hommey: How I (kind of) killed Mercurial at Mozilla

Did you hear the news? Firefox development is moving from Mercurial to Git. While the decision is far from being mine, and I was barely involved in the small incremental changes that ultimately led to this decision, I feel I have to take at least some responsibility. And if you are one of those who would rather use Mercurial than Git, you may direct all your ire at me. But let's take a step back and review the past 25 years leading to this decision. You'll forgive me for skipping some details and any possible inaccuracies. This is already a long post, and while I could have been more thorough, even I think that would have been too much. This is also not an official Mozilla position, only my personal perception and recollection as someone who was involved at times, but mostly an observer from a distance.

From CVS to DVCS

From its release in 1998, the Mozilla source code was kept in a CVS repository. If you're too young to know what CVS is, let's just say it's an old school version control system, with its set of problems. Back then, it was mostly ubiquitous in the Open Source world, as far as I remember. In the early 2000s, the Subversion version control system gained some traction, solving some of the problems that came with CVS. Incidentally, Subversion was created by Jim Blandy, who now works at Mozilla on completely unrelated matters. In the same period, the Linux kernel development moved from CVS to BitKeeper, which was more suitable to the distributed nature of the Linux community. BitKeeper had its own problem, though: it was the opposite of Open Source, but for most pragmatic people, it wasn't a real concern because free access was provided. Until it became a problem: someone at OSDL developed an alternative client to BitKeeper, and licenses of BitKeeper were rescinded for OSDL members, including Linus Torvalds (they were even prohibited from purchasing one). Following this fiasco, in April 2005, two weeks from each other, both Git and Mercurial were born. The former was created by Linus Torvalds himself, while the latter was developed by Olivia Mackall, who was a Linux kernel developer back then. And because they both came out of the same community for the same needs, and the same shared experience with BitKeeper, they both were similar distributed version control systems. Interestingly enough, several other DVCSes existed at the time as well. In this landscape, the major difference Git was making at the time was that it was blazing fast. Almost incredibly so, at least on Linux systems. That was less true on other platforms (especially Windows). It was a game-changer for handling large codebases in a smooth manner.

Anyways, two years later, in 2007, Mozilla decided to move its source code not to Bzr, not to Git, not to Subversion (which, yes, was a contender), but to Mercurial. The decision "process" was laid down in two rather colorful blog posts. My memory is a bit fuzzy, but I don't recall that it was a particularly controversial choice. All of those DVCSes were still young, and there was no definite "winner" yet (GitHub hadn't even been founded). It made the most sense for Mozilla back then, mainly because the Git experience on Windows still wasn't there, and that mattered a lot for Mozilla, with its diverse platform support. As a contributor, I didn't think much of it, although to be fair, at the time, I was mostly consuming the source tarballs.
Personal preferences

Digging through my archives, I've unearthed a forgotten chapter: I did end up setting up both a Mercurial and a Git mirror of the Firefox source repository on alioth.debian.org. Alioth.debian.org was a FusionForge-based collaboration system for Debian developers, similar to SourceForge. It was the ancestor of salsa.debian.org. I used those mirrors for the Debian packaging of Firefox (cough cough Iceweasel). The Git mirror was created with hg-fast-export, and the Mercurial mirror was only a necessary step in the process. By that time, I had converted my Subversion repositories to Git, and switched off SVK. Incidentally, I started contributing to Git around that time as well. I apparently did this not too long after Mozilla switched to Mercurial. As a Linux user, I think I just wanted the speed that Mercurial was not providing. Not that Mercurial was that slow, but the difference between a couple seconds and a couple hundred milliseconds was a significant enough difference in user experience for me to prefer Git (and Firefox was not the only thing I was using version control for). Other people had also similarly created their own mirrors, with other tools. But none of them were "compatible": their commit hashes were different. Hg-git, used by some of them, was putting extra information in commit messages that would make the conversion differ, and hg-fast-export would just not be consistent with itself! My mirror is long gone, and the others have not been updated in more than a decade.

I did end up using Mercurial when I got commit access to the Firefox source repository in April 2010. I still kept using Git for my Debian activities, but I was now also using Mercurial to push to the Mozilla servers. I joined Mozilla as a contractor a few months after that, and kept using Mercurial for a while, but as a, by then, long-time Git user, it never really clicked for me. It turns out, the sentiment was shared by several at Mozilla.

Git incursion

In the early 2010s, GitHub was becoming ubiquitous, and the Git mindshare was getting large. Multiple projects at Mozilla were already entirely hosted on GitHub. As for the Firefox source code base, Mozilla back then was kind of a Wild West, and engineers being engineers, multiple people had been using Git, with their own inconvenient workflows involving a local Mercurial clone. The most popular set of scripts was moz-git-tools, used to incorporate changes from a local Git repository into the local Mercurial copy, to then send to Mozilla servers. In terms of the number of people doing that, though, I don't think it was a lot of people, probably a few handfuls. On my end, I was still keeping up with Mercurial.

I think at that time several engineers had their own unofficial Git mirrors on GitHub, and later on Ehsan Akhgari provided another mirror, with a twist: it also contained the full CVS history, which the canonical Mercurial repository didn't have. This was particularly interesting for engineers who needed to do some code archeology and couldn't get past the 2007 cutoff of the Mercurial repository. I think that mirror ultimately became the official-looking, but really unofficial, mozilla-central repository on GitHub. On a side note, a Mercurial repository containing the CVS history was also later set up, but that didn't lead to something officially supported on the Mercurial side.

Some time around 2011~2012, I started to more seriously consider using Git for work myself, but wasn't satisfied with the workflows others had set up for themselves.
I really didn't like the idea of wasting extra disk space keeping a Mercurial clone around while using a Git mirror. So I wrote a Python script that would use Mercurial as a library to access a remote repository and produce a git-fast-import stream, allowing the creation of a Git repository without a local Mercurial clone. It worked quite well, but it was not able to update incrementally. Other, more complete tools existed already, some of which I mentioned above. But as time passed and the size and depth of the Mercurial repository grew, these tools were showing their limits and were too slow for my taste, especially for the initial clone.

Boot to Git

In the same time frame, Mozilla ventured into the mobile OS sphere with Boot to Gecko, later known as Firefox OS. What does that have to do with version control? The needs of third-party collaborators in the mobile space led to the creation of what is now the gecko-dev repository on GitHub. As I remember it, it was challenging to create, but once it was there, Git users could just clone it and have a working, up-to-date local copy of the Firefox source code and its history... which they could already have, but this was the first officially supported way of doing so. Coincidentally, Ehsan's unofficial mirror was having trouble (to the point of GitHub closing the repository) and was ultimately shut down in December 2013. You'll often find comments on the interwebs about how GitHub has become unreliable since the Microsoft acquisition. I can't really comment on that, but if you think GitHub is unreliable now, rest assured that it was worse in its beginning. And its sustainability as a platform also wasn't a given, being a rather new player. So on top of having this official mirror on GitHub, Mozilla also ventured into setting up its own Git server for greater control and reliability. But the canonical repository was still the Mercurial one, and while Git users now had a supported mirror to pull from, they still had to somehow interact with Mercurial repositories, most notably for the Try server.

Git slowly creeping in Firefox build tooling

Still in the same time frame, tooling around building Firefox was improving drastically. For obvious reasons, when version control integration was needed in the tooling, Mercurial support was always a no-brainer. The first explicit acknowledgement of a Git repository for the Firefox source code, other than the addition of the .gitignore file, was bug 774109. It added a script to install the prerequisites to build Firefox on macOS (still called OS X back then), and that would print a message inviting people to obtain a copy of the source code with either Mercurial or Git. That was a precursor to the current bootstrap.py, from September 2012. Following that, as far as I can tell, the first real incursion of Git into the Firefox source tree tooling happened in bug 965120. A few days earlier, bug 952379 had added a mach clang-format command that would apply clang-format-diff to the output from hg diff. Obviously, running hg diff on a Git working tree didn't work, so bug 965120 was filed, and support for Git was added there. That was in January 2014. A year later, when the initial implementation of mach artifact was added (which ultimately led to artifact builds), Git users were an immediate thought. But while they were considered, it was not to support them, but to avoid actively breaking their workflows. Git support for mach artifact was eventually added 14 months later, in March 2016.
From gecko-dev to git-cinnabar

Let's step back a little here, back to the end of 2014. My user experience with Mercurial had reached a level of dissatisfaction that was enough for me to decide to take that script from a couple years prior and make it work for incremental updates. That meant finding a way to store enough information locally to be able to reconstruct whatever the incremental updates would be relying on (guess why other tools hid a local Mercurial clone under the hood). I got something working rather quickly, and after talking to a few people about this side project at the Mozilla Portland All Hands and seeing their excitement, I published a git-remote-hg initial prototype on the last day of the All Hands. Within weeks, the prototype gained the ability to directly push to Mercurial repositories, and a couple months later, it was renamed to git-cinnabar. At that point, as a Git user, instead of cloning the gecko-dev repository from GitHub and switching to a local Mercurial repository whenever you needed to push to a Mercurial repository (i.e. the aforementioned Try server, or, at the time, for reviews), you could just clone and push directly from/to Mercurial, all within Git. And it was fast, too. You could get a full clone of mozilla-central in less than half an hour, when at the time other similar tools would take more than 10 hours (needless to say, it's even worse now).

Another couple months later (we're now at the end of April 2015), git-cinnabar became able to start off a local clone of the gecko-dev repository, rather than cloning from scratch, which could be time-consuming. But because git-cinnabar and the tool that was updating gecko-dev weren't producing the same commits, this setup was cumbersome and not really recommended. For instance, if you pushed something to mozilla-central with git-cinnabar from a gecko-dev clone, it would come back with a different commit hash in gecko-dev, and you'd have to deal with the divergence. Eventually, in April 2020, the scripts updating gecko-dev were switched to git-cinnabar, making the use of gecko-dev alongside git-cinnabar a more viable option. Ironically(?), the switch occurred to ease collaboration with KaiOS (you know, the mobile OS born from the ashes of Firefox OS). Well, okay, in all honesty, when the need to sync in both directions between Git and Mercurial (we had only ever synced from Mercurial to Git) came up, I nudged Mozilla in the direction of git-cinnabar, which, in my (biased but still honest) opinion, was the more reliable option for two-way synchronization (we did have regular conversion problems with hg-git; nothing of the sort has happened since the switch).

One Firefox repository to rule them all

For reasons I don't know, Mozilla decided to use separate Mercurial repositories as "branches". With the switch to the rapid release process in 2011, that meant one repository for nightly (mozilla-central), one for aurora, one for beta, and one for release. And with the addition of Extended Support Releases in 2012, we now add a new ESR repository every year. Boot to Gecko also had its own branches, and so did Fennec (Firefox for Mobile, before Android). There are a lot of them. And then there are also integration branches, where developers' work lands before being merged in mozilla-central (or backed out if it breaks things), always leaving mozilla-central in a (hopefully) good state. Only one of them remains in use today, though. I can only suppose that the way Mercurial branches work was not deemed practical.
It is worth noting, though, that Mercurial branches are used in some cases, to branch off a dot-release when the next major release process has already started, so it's not a matter of not knowing the feature exists or some such. In 2016, Gregory Szorc set up a new repository that would contain them all (or at least most of them), which eventually became what is now the mozilla-unified repository. This would e.g. simplify switching between branches when necessary. Seven years later, for some reason, the other "branches" still exist, but most developers are expected to be using mozilla-unified. Mozilla's CI also switched to using mozilla-unified as its base repository. Honestly, I'm not sure why the separate repositories are still the main entry point for pushes, rather than going directly to mozilla-unified, but it probably comes down to switching being work, and not being a top priority. Also, it probably doesn't help that working with multiple heads in Mercurial, even (especially?) with bookmarks, can be a source of confusion. To give an example: if you aren't careful and do a plain clone of the mozilla-unified repository, you may not end up on the latest mozilla-central changeset, but rather, e.g., one from beta, or some other branch, depending on which one was last updated.

Hosting is simple, right?

Put your repository on a server, install hgweb or gitweb, and that's it? Maybe that works for... Mercurial itself, but that repository "only" has slightly over 50k changesets and less than 4k files. Mozilla-central has more than an order of magnitude more changesets (close to 700k) and two orders of magnitude more files (more than 700k if you count the deleted or moved files, 350k if you count the currently existing ones). And remember, there are a lot of "duplicates" of this repository. And I didn't even mention user repositories and project branches. Sure, it's a self-inflicted pain, and you'd think it could probably(?) be mitigated with shared repositories. But consider the simple case of two repositories: mozilla-central and autoland. You make autoland use mozilla-central as a shared repository. Now, you push something new to autoland, and it's stored in the autoland datastore. Eventually, you merge to mozilla-central. Congratulations, it's now in both datastores, and you'd need to clean up autoland if you wanted to avoid the duplication. Now, you'd think mozilla-unified would solve these issues, and it would... to some extent, because that wouldn't cover the user repositories and project branches briefly mentioned above, which in GitHub parlance would be considered forks. So you'd want a mega global datastore shared by all repositories, and repositories would need to only expose what they really contain. Does Mercurial support that? I don't think so (okay, I'll give you that: even if it doesn't, it could, but that's extra work). And since we're talking about a transition to Git, does Git support that? You may have read about how you can link to a commit from a fork and make-pretend that it comes from the main repository on GitHub? At least it shows a warning now. That's essentially the architectural reason why. So the actual answer is that Git doesn't support it out of the box, but GitHub has some backend magic to handle it somehow (and hopefully, other things like Gitea, Girocco, Gitlab, etc. have something similar).

Now, to come back to the size of the repository: a repository is not a static file. It's a server with which you negotiate what you have against what it has that you want.
Then the server bundles what you asked for based on what you said you have. Or, in the opposite direction, you negotiate what you have that it doesn't, you send it, and the server incorporates what you sent. Fortunately, the latter is less frequent and requires authentication. But the former is more frequent and CPU-intensive, especially when pulling a large number of changesets, which, incidentally, cloning is. "But there is a solution for clones", you might say, which is true. That's clonebundles, which offload the CPU-intensive part of cloning to a single job scheduled regularly. Guess who implemented it? Mozilla. But that only covers the cloning part. We had actually laid the groundwork to support offloading large incremental updates and split clones, but that never materialized. Even with all that, you are still left with a server that can display file contents, diffs, blames, provide zip archives of a revision, and more, all of which are CPU-intensive in their own way. And these endpoints are regularly abused, and cause extra load on your servers, yes plural, because of course a single server won't handle the load for the number of users of your big repositories. And because your endpoints are abused, you have to close some of them. And I'm not mentioning the Try repository with its tens of thousands of heads, which brings its own set of problems (and it would have even more heads if we didn't fake-merge them once in a while). Of course, all the above applies to Git (and it only gained support for something akin to clonebundles last year). So, when the Firefox OS project was stopped, there wasn't much motivation to continue supporting our own Git server, Mercurial still being the official point of entry, and git.mozilla.org was shut down in 2016.

The growing difficulty of maintaining the status quo

Slowly but steadily, in more recent years, as new tooling was added that needed some input from the source code manager, support for Git was more and more consistently added. But at the same time, as people left for other endeavors and weren't necessarily replaced, or more recently with layoffs, resources allocated to such tooling have been spread thin. Meanwhile, the repository growth didn't take a break, and the Try repository was becoming an increasing pain, with push times quite often exceeding 10 minutes. The ongoing work to move Try pushes to Lando will hide the problem under the rug, but the underlying problem will still exist (although the last version of Mercurial seems to have improved things). On the flip side, more and more people have been relying on Git for Firefox development, to my own surprise, as I didn't really push for that to happen. It just happened organically, by way of git-cinnabar existing, providing a compelling experience to those who prefer Git, and, I guess, word of mouth. I was genuinely surprised when I recently heard that the use of Git among moz-phab users had surpassed a third. I did, however, occasionally orient people who struggled with Mercurial and said they were more familiar with Git towards git-cinnabar. I suspect there's a somewhat large number of people who never realized Git was a viable option. But that, on its own, can come with its own challenges: if you use git-cinnabar without being backed by gecko-dev, you'll have a hard time sharing your branches on GitHub, because you can't push to a fork of gecko-dev without pushing your entire local repository, as they have different commit histories.
And switching to gecko-dev when you weren't already using it requires some extra work to rebase all your local branches from the old commit history to the new one. Clone times with git-cinnabar have also started to go a little out of hand in the past few years, but this was mitigated in a similar manner as with the Mercurial cloning problem: with static files that are refreshed regularly. Ironically, that made cloning with git-cinnabar faster than cloning with Mercurial. But generating those static files is increasingly time-consuming. As of writing, generating those for mozilla-unified takes close to 7 hours. I was predicting clone times over 10 hours "in 5 years" in a post from 4 years ago; I wasn't too far off. With exponential growth, it could still happen, although to be fair, CPUs have improved since. I will explore the performance aspect in a subsequent blog post, alongside the upcoming release of git-cinnabar 0.7.0-b1. I don't even want to check how long it now takes with hg-git or git-remote-hg (they were already taking more than a day when git-cinnabar was taking a couple hours). I suppose it's about time that I clarify that git-cinnabar has always been a side project. It hasn't been part of my duties at Mozilla, and the extent to which Mozilla supports git-cinnabar is in the form of taskcluster workers on the community instance for both git-cinnabar CI and generating those clone bundles. Consequently, that makes the above git-cinnabar-specific issues a Me problem, rather than a Mozilla problem.

Taking the leap

I can't speak for the people who made the proposal to move to Git, nor for the people who gave it the green light. But I can at least give my perspective. Developers have regularly asked why Mozilla was still using Mercurial, but I think it was the first time that a formal proposal was laid out. And it came from the Engineering Workflow team, responsible for issue tracking, code reviews, source control, build and more. It's easy to say "Mozilla should have chosen Git in the first place", but back in 2007, GitHub wasn't there, Bitbucket wasn't there, and all the available options were rather new (especially compared to the then 21-year-old CVS). I think Mozilla made the right choice, all things considered. Had they waited a couple years, the story might have been different. You might say that Mozilla stayed with Mercurial for so long because of the sunk cost fallacy. I don't think that's true either. But after the biggest Mercurial repository hosting service turned off Mercurial support, and the main contributor to Mercurial went their own way, it's hard to ignore that the landscape has evolved. And the problems that we regularly encounter with the Mercurial servers are not going to get any better as the repository continues to grow. As far as I know, all the Mercurial repositories bigger than Mozilla's are... not using Mercurial. Google has its own closed-source server, and Facebook has another of its own, and it's not really public either. With resources spread thin, I don't expect Mozilla to be able to continue supporting a Mercurial server indefinitely (although I guess Octobus could be contracted to give a hand, but is that sustainable?). Mozilla, being a champion of Open Source, also doesn't live in a silo. At some point, you have to meet your contributors where they are. And the Open Source world is now predominantly using Git.
I'm sure the vast majority of new hires at Mozilla in the past, say, 5 years know Git and have had to learn Mercurial (although they arguably didn't need to). Even within Mozilla, with thousands(!) of repositories on GitHub, Firefox is now actually the exception rather than the norm. I should even actually say Desktop Firefox, because even Mobile Firefox lives on GitHub (although Fenix is moving back in together with Desktop Firefox, and the timing is such that that will probably happen before Firefox moves to Git). Heck, even Microsoft moved to Git! With a significant developer base already using Git thanks to git-cinnabar, and all the constraints and problems I mentioned previously, it actually seems natural that a transition (finally) happens. However, had git-cinnabar or something similarly viable not existed, I don't think Mozilla would be in a position to take this decision. On one hand, it probably wouldn't be in the current situation of having to support both Git and Mercurial in the tooling around Firefox, nor have the resource constraints related to that. But on the other hand, it would be farther from supporting Git and being able to make the switch in order to address all the other problems.

But... GitHub?

I hope I made a compelling case that hosting is not as simple as it can seem, at the scale of the Firefox repository. It's also not Mozilla's main focus. Mozilla has enough on its plate with the migration of existing infrastructure that does rely on Mercurial to understandably not want to figure out the hosting part, especially with limited resources, and with the mixed experience hosting both Mercurial and Git has been so far. After all, GitHub couldn't even display things like the contributors' graph on gecko-dev until recently, and hosting is literally their job! They still drop the ball on large blames (thankfully we have searchfox for those). Where does that leave us? GitLab? For those criticizing GitHub for being proprietary, that's probably not open enough. Cloud Source Repositories? "But GitHub is Microsoft" is a complaint I've read a lot after the announcement. Do you think Google hosting would have appealed to these people? Bitbucket? I'm kind of surprised it wasn't in the list of providers that were considered, but I'm also kind of glad it wasn't (and I'll leave it at that). I think the only relatively big hosting provider that could have made the people criticizing the choice of GitHub happy is Codeberg, but I hadn't even heard of it before it was mentioned in response to Mozilla's announcement. But really, with literal thousands of Mozilla repositories already on GitHub, and literal tens of millions of repositories on the platform overall, the pragmatic in me can't deny that it's an attractive option (and I can't stress enough that I wasn't remotely close to the room where the discussion about what choice to make happened). "But it's a slippery slope". I can see that being a real concern. LLVM also moved its repository to GitHub (from a, I think, self-hosted Subversion server), and ended up moving off Bugzilla and Phabricator to GitHub issues and PRs four years later. As an occasional contributor to LLVM, I hate this move. I hate the GitHub review UI with a passion. At least, right now, GitHub PRs are not a viable option for Mozilla, given their lack of support for security-related PRs, and the more general shortcomings in the review UI. That doesn't mean things won't change in the future, but let's not get too far ahead of ourselves.
The move to Git has just been announced, and the migration has not even begun yet. Just because Mozilla is moving the Firefox repository to GitHub doesn't mean it's locked in forever or that all the eggs are going to be thrown into one basket. If bridges need to be crossed in the future, we'll see then.

So, what's next?

The official announcement said we're not expecting the migration to really begin until six months from now. I'll swim against the current here and say this: the earlier you can switch to Git, the earlier you'll find out what works and what doesn't work for you, whether you already know Git or not. While there is not one unique workflow, here's what I would recommend to anyone who wants to take the leap off Mercurial right now: As there is no one-size-fits-all workflow, I won't tell you how to organize yourself from there. I'll just say this: if you know the Mercurial sha1s of your previous local work, you can create branches for them with:
$ git branch <branch_name> $(git cinnabar hg2git <hg_sha1>)
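For instance, with an illustrative branch name and Mercurial changeset sha1, that might look like:
$ git branch my-work $(git cinnabar hg2git 0123456789abcdef0123456789abcdef01234567)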
At this point, you should have everything available on the Git side, and you can remove the .hg directory. Or move it into some empty directory somewhere else, just in case. But don't leave it here; it will only confuse the tooling. Artifact builds WILL be confused, though, and you'll have to ./mach configure before being able to do anything. You may also hit bug 1865299 if your working tree is older than this post. If you have any problem or question, you can ping me on #git-cinnabar or #git on Matrix. I'll put the instructions above somewhere on wiki.mozilla.org, and we can collaboratively iterate on them. Now, what the announcement didn't say is that the Git repository WILL NOT be gecko-dev, doesn't exist yet, and WON'T BE COMPATIBLE (trust me, it'll be for the better). Why did I make you do all the above, you ask? Because that won't be a problem. I'll have you covered, I promise. The upcoming release of git-cinnabar 0.7.0-b1 will have a way to smoothly switch between gecko-dev and the future repository (incidentally, that will also allow switching from a pure git-cinnabar clone to a gecko-dev one, for the git-cinnabar users who have kept reading this far).

What about git-cinnabar?

With Mercurial going the way of the dodo at Mozilla, my own need for git-cinnabar will vanish. Legitimately, this begs the question of whether it will still be maintained. I can't answer for sure. I don't have a crystal ball. However, the needs of the transition itself will motivate me to finish some long-standing things (like finalizing the support for pushing merges, which is currently behind an experimental flag) or implement some missing features (support for creating Mercurial branches). Git-cinnabar started as a Python script; it grew a sidekick implemented in C, which then incorporated some Rust, which then cannibalized the Python script and took its place. It is now close to 90% Rust and 10% C (if you don't count the code from Git that is statically linked to it), and has sort of become my Rust playground (it's also, I must admit, a mess, because of its history, but it's getting better). So day-to-day use with Mercurial is not my sole motivation to keep developing it. If it were, it would stay stagnant, because all the features I need are there, and the speed is not all that bad, although I know it could be better. Arguably, though, git-cinnabar has been relatively stagnant feature-wise, because all the features I need are there. So, no, I don't expect git-cinnabar to die along with Mercurial use at Mozilla, but I can't really promise anything either.

Final words

That was a long post. But there was a lot of ground to cover. And I still skipped over a bunch of things. I hope I didn't bore you to death. If I did and you're still reading... what's wrong with you? ;) So this is the end of Mercurial at Mozilla. So long, and thanks for all the fish. But this is also the beginning of a transition that is not easy, and that will not be without hiccups, I'm sure. So fasten your seatbelts (plural), and welcome the change. To circle back to the clickbait title: did I really kill Mercurial at Mozilla? Of course not. But it's like I stumbled upon a few sparks and tossed a can of gasoline on them. I didn't start the fire, but I sure made it into a proper bonfire... and now it has turned into a wildfire. And who knows? 15 years from now, someone else might be looking back at how Mozilla picked Git at the wrong time, and that, had we waited a little longer, we would have picked some yet-to-come new horse.
But hey, that's the tech cycle for you.

25 October 2023

Russ Allbery: Review: Going Infinite

Review: Going Infinite, by Michael Lewis
Publisher: W.W. Norton & Company
Copyright: 2023
ISBN: 1-324-07434-5
Format: Kindle
Pages: 255
My first reaction when I heard that Michael Lewis had been embedded with Sam Bankman-Fried, working on a book when Bankman-Fried's cryptocurrency exchange FTX collapsed into bankruptcy after losing billions of dollars of customer deposits, was "holy shit, why would you talk to Michael Lewis about your dodgy cryptocurrency company?" Followed immediately by "I have to read this book." This is that book.

I wasn't sure how Lewis would approach this topic. His normal (although not exclusive) area of interest is financial systems and crises, and there is lots of room for multiple books about cryptocurrency fiascoes using someone like Bankman-Fried as a pivot. But Going Infinite is not like The Big Short or Lewis's other financial industry books. It's a nearly straight biography of Sam Bankman-Fried, with just enough context for the reader to follow his life.

To understand what you're getting in Going Infinite, I think it's important to understand what sort of book Lewis likes to write. Lewis is not exactly a reporter, although he does explain complicated things for a mass audience. He's primarily a storyteller who collects people he finds fascinating. This book was therefore never going to be like, say, Carreyrou's Bad Blood or Isaac's Super Pumped. Lewis's interest is not in a forensic account of how FTX or Alameda Research were structured. His interest is in what makes Sam Bankman-Fried tick, what's going on inside his head. That's not a question Lewis directly answers, though. Instead, he shows you Bankman-Fried as Lewis saw him and was able to reconstruct from interviews and sources, and lets you draw your own conclusions. Boy did I ever draw a lot of conclusions, most of which were highly unflattering.

However, one conclusion I didn't draw, and had been dubious about even before reading this book, was that Sam Bankman-Fried was some sort of criminal mastermind who intentionally plotted to steal customer money. Lewis clearly doesn't believe this is the case, and with the caveat that my study of the evidence outside of this book has been spotty and intermittent, I think Lewis has the better of the argument. I am utterly fascinated by this, and I'm afraid this review is going to turn into a long summary of my take on the argument, so here's the capsule review before you get bored and wander off: This is a highly entertaining book written by an excellent storyteller. I am also inclined to believe most of it is true, but given that I'm not on the jury, I'm not that invested in whether Lewis is too credulous towards the explanations of the people involved. What I do know is that it's a fantastic yarn with characters who are too wild to put in fiction, and I thoroughly enjoyed it.

There are a few things that everyone involved appears to agree on, and therefore I think we can take as settled. One is that Bankman-Fried, and most of the rest of FTX and Alameda Research, never clearly distinguished between customer money and all of the other money. It's not obvious that their home-grown accounting software (written entirely by one person! who never spoke to other people! in code that no one else could understand!) was even capable of clearly delineating between their piles of money. Another is that FTX and Alameda Research were thoroughly intermingled. There was no official reporting structure and possibly not even a coherent list of employees. The environment was so chaotic that lots of people, including Bankman-Fried, could have stolen millions of dollars without anyone noticing.
But it was also so chaotic that they could, and did, literally misplace millions of dollars by accident, or because Bankman-Fried had problems with object permanence.

Something that was previously less obvious from news coverage but that comes through very clearly in this book is that Bankman-Fried seriously struggled with normal interpersonal and societal interactions. We know from multiple sources that he was diagnosed with ADHD and depression (Lewis describes it specifically as anhedonia, the inability to feel pleasure). The ADHD in Lewis's account is quite severe and does not sound controlled, despite medication; for example, Bankman-Fried routinely played timed video games while he was having important meetings, forgot things the moment he stopped dealing with them, was constantly on his phone or seeking out some other distraction, and often stimmed (by bouncing his leg) to a degree that other people found distracting. Perhaps more tellingly, Bankman-Fried repeatedly describes himself in diary entries and correspondence to other people (particularly Caroline Ellison, his employee and on-and-off secret girlfriend) as being devoid of empathy and unable to access his own emotions, which Lewis supports with stories from former co-workers.

I'm very hesitant to diagnose someone via a book, but, at least in Lewis's account, Bankman-Fried nearly walks down the symptom list of antisocial personality disorder in his own description of himself to other people. (The one exception is around physical violence; there is nothing in this book or in any of the other reporting that I've seen to indicate that Bankman-Fried was violent or physically abusive.)

One of the recurrent themes of this book is that Bankman-Fried never saw the point in following rules that didn't make sense to him or worrying about things he thought weren't important, and therefore simply didn't. By about a third of the way into this book, before FTX is even properly started, very little about its eventual downfall will seem that surprising.

There was no way that Sam Bankman-Fried was going to be able to run a successful business over time. He was extremely good at probabilistic trading and spotting exploitable market inefficiencies, and extremely bad at essentially every other aspect of living in a society with other people, other than a hit-or-miss ability to charm that worked much better with large audiences than one-on-one. The real question was why anyone would ever entrust this man with millions of dollars or decide to work for him for longer than two weeks.

The answer to those questions changes over the course of this story. Later on, it was timing. Sam Bankman-Fried took the techniques of high frequency trading he learned at Jane Street Capital and applied them to exploiting cryptocurrency markets at precisely the right time in the cryptocurrency bubble. There was far more money than sense, the most ruthless financial players were still too leery to get involved, and a rising tide was lifting all boats, even the ones that were piles of driftwood. When cryptocurrency inevitably collapsed, so did his businesses. In retrospect, that seems inevitable.

The early answer, though, was effective altruism. A full discussion of effective altruism is beyond the scope of this review, although Lewis offers a decent introduction in the book. 
The short version is that a sensible and defensible desire to use stronger standards of evidence in evaluating charitable giving turned into a bizarre navel-gazing exercise in making up statistical risks to hypothetical future people and treating those made-up numbers as if they should be the bedrock of one's personal ethics. One of the people most responsible for this turn is an Oxford philosopher named Will MacAskill. Sam Bankman-Fried was already obsessed with utilitarianism, in part due to his parents' philosophical beliefs, and it was a presentation by Will MacAskill that converted him to the effective altruism variant of extreme utilitarianism. In Lewis's presentation, this was like joining a cult.

The impression I came away with feels like something out of a science fiction novel: Bankman-Fried knew there was some serious gap in his thought processes where most people had empathy, was deeply troubled by this, and latched on to effective altruism as the ethical framework to plug into that hole. So much of effective altruism sounds like a con game that it's easy to think the participants are lying, but Lewis clearly believes Bankman-Fried is a true believer. He appeared to be sincerely trying to make money in order to use it to solve existential threats to society, he does not appear to be motivated by money apart from that goal, and he was following through (in bizarre and mostly ineffective ways).

I find this particularly believable because effective altruism as a belief system seems designed to fit Bankman-Fried's personality and justify the things he wanted to do anyway. Effective altruism says that empathy is meaningless, emotion is meaningless, and ethical decisions should be made solely on the basis of expected value: how much return (usually in safety) does society get for your investment. Effective altruism says that all the things that Sam Bankman-Fried was bad at were useless and unimportant, so he could stop feeling bad about his apparent lack of normal human morality. The only thing that mattered was the thing that he was exceptionally good at: probabilistic reasoning under uncertainty. And, critically to the foundation of his business career, effective altruism gave him access to investors and a recruiting pool of employees, things he was entirely unsuited to acquiring the normal way.

There's a ton more of this book that I haven't touched on, but this review is already quite long, so I'll leave you with one more point. I don't know how true Lewis's portrayal is in all the details. He took the approach of getting very close to most of the major players in this drama and largely believing what they said happened, supplemented by startling access to sources like Bankman-Fried's personal diary and Caroline Ellison's personal diary. (He also seems to have gotten extensive information from the personal psychiatrist of most of the people involved; I'm not sure if there's some reasonable explanation for this, but based solely on the material in this book, it seems to be a shocking breach of medical ethics.) But Lewis is a storyteller more than he's a reporter, and his bias is for telling a great story. It's entirely possible that the events related here are not entirely true, or are skewed in favor of making a better story. It's certainly true that they're not the complete story. But, that said, I think a book like this is a useful counterweight to the human tendency to believe in moral villains. 
This is, frustratingly, a counterweight extended almost exclusively to higher-class white people like Bankman-Fried. This is infuriating, but that doesn't make it wrong. It means we should extend that analysis to more people.

Once FTX collapsed, a lot of people became very invested in the idea that Bankman-Fried was a straightforward embezzler. Either he intended from the start to steal everyone's money or, more likely, he started losing money, panicked, and stole customer money to cover the hole. Lots of people in history have done exactly that, and lots of people involved in cryptocurrency have tenuous attachments to ethics, so this is a believable story. But people are complicated, and there's also truth in the maxim that every villain is the hero of their own story. Lewis is after a less boring story than "the crook stole everyone's money," and that leads to some bias. But sometimes the less boring story is also true.

Here's the thing: even if Sam Bankman-Fried never intended to take any money, he clearly did intend to mix customer money with Alameda Research funds. In Lewis's account, he never truly believed in them as separate things. He didn't care about following accounting or reporting rules; he thought they were boring nonsense that got in his way. There is obvious criminal intent here in any reading of the story, so I don't think Lewis's more complex story would let him escape prosecution. He refused to follow the rules, and as a result a lot of people lost a lot of money. I think it's a useful exercise to leave mental space for the possibility that he had far less obvious reasons for those actions than that he was a simple thief, while still enforcing the laws that he quite obviously violated.

This book was great. If you like Lewis's style, this was some of the best entertainment I've read in a while. Highly recommended; if you are at all interested in this saga, I think this is a must-read.

Rating: 9 out of 10

20 October 2023

Freexian Collaborators: Debian Contributions: Freexian meetup, debusine updates, lpr/lpd in Debian, and more! (by Utkarsh Gupta, Stefano Rivera)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Freexian Meetup, by Stefano Rivera, Utkarsh Gupta, et al. During DebConf, Freexian organized a meetup for its collaborators and those interested in learning more about Freexian and its services. It was well received and many people interested in Freexian showed up. Some developers who were interested in contributing to LTS came to get more details about joining the project. And some prospective customers came to get to know us and ask questions. Sadly, the tragic loss of Abraham shook DebConf, both individually and structurally. The meetup got rescheduled to a small room without video coverage. With that, we still had a wholesome interaction and here's a quick picture from the meetup taken by Utkarsh (which is also why he's missing!).

Debusine, by Raphaël Hertzog, et al. Freexian has been investing in debusine for a while, but development speed is about to increase dramatically thanks to funding from SovereignTechFund.de. Raphaël laid out the 5 milestones of the funding contract, and filed the issues for the first milestone. Together with Enrico and Stefano, they established a workflow for the expanded team. Among the first steps of this milestone, Enrico started to work on a developer-friendly description of debusine that we can use when we reach out to the many Debian contributors that we will have to interact with. And Raphaël started the design work of the autopkgtest and lintian tasks, i.e. what's the interface to schedule such tasks, and what behavior and what associated options do we support? At this point you might wonder what debusine is supposed to be; let us try to answer this: Debusine manages scheduling and distribution of Debian-related build and QA tasks to a network of worker machines. It also manages the resulting artifacts and provides the results in an easy-to-consume way. We want to make it easy for Debian contributors to leverage all the great QA tools that Debian provides. We want to build the next generation of Debian's build infrastructure, one that will continue to reliably do what it already does, but that will also enable distribution-wide experiments, custom package repositories and custom workflows with advanced package reviews. If this all sounds interesting to you, don't hesitate to watch the project on salsa.debian.org and to contribute.

lpr/lpd in Debian, by Thorsten Alteholz. During DebConf23, Till Kamppeter presented CPDB (Common Print Dialog Backend), a new way to handle print queues. After this talk it was discussed whether the old lpr/lpd based printing system could be abandoned in Debian or whether there is still demand for it. So Thorsten asked on the debian-devel mailing list whether anybody uses it. Oddly enough, these old packages are still useful:
  • Within a small network it is easier to distribute a printcap file than to properly configure CUPS clients (see the example entry after this list).
  • One of the biggest manufacturers of WLAN routers and DSL boxes only supports raw queues when attaching a USB printer to their hardware. Admittedly, CPDB still has problems with such raw queues.
  • The Pharos printing system at MIT is still lpd-based.
As a result, the lpr/lpd stuff is not yet ready to be abandoned and Thorsten will adopt the relevant packages (or rather move them under the umbrella of the debian-printing team). Though there are no plans to develop new features, those packages should at least have a maintainer. This month Thorsten adopted rlpr, a utility for lpd printing without using /etc/printcap. The next one he is working on is lprng, an lpr/lpd printer spooling system. If you know of any other package that is also needed and still maintained by the QA team, please tell Thorsten.
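To illustrate the printcap point above, here is a minimal sketch of a client-side /etc/printcap entry for a remote queue; the host name, queue name and spool directory are placeholders rather than anything from a real setup:

# remote queue on a central print server
lp|office|Remote office printer:\
        :rm=printserver.example.org:\
        :rp=office:\
        :sd=/var/spool/lpd/office:\
        :mx#0:\
        :sh:

Copying one such file around a small network is indeed less work than configuring every CUPS client individually.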

/usr-merge, by Helmut Grohne Discussion about lifting the file move moratorium has been initiated with the CTTE and the release team. A formal lift is dependent on updating debootstrap in older suites though. A significant number of packages can automatically move their systemd unit files if dh_installsystemd and systemd.pc change their installation targets. Unfortunately, doing so makes some packages FTBFS and therefore patches have been filed. The analysis tool, dumat, has been enhanced to better understand which upgrade scenarios are considered supported to reduce false positive bug filings and gained a mode for local operation on a .changes file meant for inclusion in salsa-ci. The filing of bugs from dumat is still manual to improve the quality of reports. Since September, the moratorium has been lifted.

Miscellaneous contributions
  • Raphaël updated Django's backport in bullseye-backports to match the latest security release that was published in bookworm. Tracker.debian.org is still using that backport.
  • Helmut Grohne sent 13 patches for cross build failures.
  • Helmut Grohne performed a maintenance upload of debvm enabling its use in autopkgtests.
  • Helmut Grohne wrote an API-compatible reimplementation of autopkgtest-build-qemu. It is powered by mmdebstrap, and therefore unprivileged, EFI-only, and will soon be included in mmdebstrap (see the example after this list).
  • Santiago continued the work regarding how to make it easier to (automatically) test reverse dependencies. An example of the ongoing work was presented during the Salsa CI BoF at DebConf 23.
    In fact, omniorb-dfsg test pipelines such as the above were used for the omniorb-dfsg 4.3.0 transition, verifying how the reverse dependencies (tango, pytango and omnievents) were built and how their autopkgtest jobs ran with the to-be-uploaded omniorb-dfsg new release.
  • Utkarsh and Stefano attended and helped run DebConf 23. Also continued winding up DebConf 22 accounting.
  • Anton Gladky did some science team uploads to fix RC bugs.
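For reference on the autopkgtest-build-qemu item above: since the reimplementation is API-compatible, the documented interface of the original tool should carry over unchanged. A minimal sketch (the suite and image path are example values, not taken from the report):

# build a QEMU test image for unstable
autopkgtest-build-qemu unstable /srv/images/autopkgtest-unstable.img
# then run a package's tests inside that image
autopkgtest some-package -- qemu /srv/images/autopkgtest-unstable.img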

12 October 2023

Reproducible Builds: Reproducible Builds in September 2023

Welcome to the September 2023 report from the Reproducible Builds project! In these reports, we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.
Andreas Herrmann gave a talk at All Systems Go 2023 titled "Fast, correct, reproducible builds with Nix and Bazel". Quoting from the talk description:

You will be introduced to Google's open source build system Bazel, and will learn how it provides fast builds, how correctness and reproducibility is relevant, and how Bazel tries to ensure correctness. But, we will also see where Bazel falls short in ensuring correctness and reproducibility. You will [also] learn about the purely functional package manager Nix and how it approaches correctness and build isolation. And we will see where Bazel has an advantage over Nix when it comes to providing fast feedback during development.
Andreas also shows how you can get the best of both worlds and combine Nix and Bazel, too. A video of the talk is available.
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb fixed compatibility with file(1) version 5.45 [ ] and updated some documentation [ ]. In addition, Vagrant Cascadian extended support for GNU Guix [ ][ ] and updated the version in that distribution as well. [ ].
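For readers who have not used diffoscope, a minimal invocation looks like the following; the file names are placeholders for any two artifacts you want to compare:

# compare two builds of the same package and write an HTML report
diffoscope --html report.html package_1.0-1_amd64.deb package_1.0-1.rebuild_amd64.deb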
Yet another reminder that our upcoming Reproducible Builds Summit is set to take place from October 31st to November 2nd 2023 in Hamburg, Germany. If you haven't been before, our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. If you're interested in joining us this year, please make sure to read the event page, the news item, or the invitation email that Mattia Rizzolo sent out recently, all of which have more details about the event and location. We are also still looking for sponsors to support the event, so please reach out to the organising team if you are able to help. Also note that PackagingCon 2023 is taking place in Berlin just before our summit.
On the Reproducible Builds website, Greg Chabala updated the JVM-related documentation to fix a link to the BUILDSPEC.md file. [ ] And Fay Stegerman fixed the builds, which were failing because of a YAML syntax error.

Distribution work

September saw F-Droid add ten new reproducible apps, and one existing app switched to reproducible builds. In addition, two reproducible apps were archived and one was disabled, for a current total of 199 apps published with Reproducible Builds and using the upstream developer's signature. [ ] An extensive blog post was also published on f-droid.org titled "Reproducible builds, signing keys, and binary repos".

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen:
  • Disable armhf and i386 builds due to Debian bug #1052257. [ ][ ][ ][ ]
  • Run diffoscope with a lower ionice priority. [ ]
  • Log every build in a simple text file [ ] and create persistent stamp files when running diffoscope to ease debugging [ ].
  • Run schedulers one hour after dinstall again. [ ]
  • Temporarily use diffoscope from the host, and not from a schroot running the tested suite. [ ][ ]
  • Fail the diffoscope distribution test if the diffoscope version cannot be determined. [ ]
  • Fix a spelling error in the email to IRC gateway. [ ]
  • Force (and document) the reconfiguration of all jobs, due to the recent rise of zombies. [ ][ ][ ][ ]
  • Deal with a rare condition when killing processes which should not be there. [ ]
  • Install the Debian backports kernel in an attempt to address Debian bug #1052257. [ ][ ]
In addition, Mattia Rizzolo fixed a call to diffoscope --version (as suggested by Fay Stegerman on our mailing list) [ ], worked on an openQA credential issue [ ] and also made some changes to the machine-readable reproducible metadata, reproducible-tracker.json [ ]. Lastly, Roland Clobus added instructions for manual configuration of the openQA secrets [ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

3 October 2023

Junichi Uekawa: Electronic receipt storage in Japan.

Electronic receipt storage in Japan. Japan also started allowing electronic data for receipts, but had some red tape associated with it. Presumably they were worried about an increase in fraud cases. A law amendment that went into effect in January 2022 made the last annoying bits simpler. We used to be required to sign the paper receipt and scan it within 3 days of receiving it. This special requirement is now gone. It took a few years to reach this state, but now we are in a similar state to our colleagues in the US. Nice.

21 September 2023

Jonathan McDowell: DebConf23 Writeup

DebConf2023 Logo (I wrote this up for an internal work post, but I figure it's worth sharing more publicly too.)

I spent last week at DebConf23, this year's instance of the annual Debian conference, which was held in Kochi, India. As usual, DebConf provides a good reason to see a new part of the world; I've been going since 2004 (Porto Alegre, Brazil), and while I've missed a few (Mexico, Bosnia, and Switzerland) I've still managed to make it to instances on 5 continents. This has absolutely nothing to do with work, so I went on my own time + dime, but I figured a brief write-up might prove of interest.

I first installed Debian back in 1999 as a machine that was being co-located to operate as a web server / email host. I was attracted by the promise of easy online upgrades (or, at least, upgrades that could be performed without the need to be physically present at the machine, even if they naturally required a reboot at some point). It has mostly delivered on this over the years, and I've never found a compelling reason to move away. I became a Debian Developer in 2000.

As a massively distributed volunteer project, DebConf provides an opportunity to find out what's happening in other areas of the project, catch up with team mates, and generally feel more involved and energised to work on Debian stuff. Also, by this point in time, a lot of Debian folk are good friends and it's always nice to catch up with them.

On that point, I felt that this year the hallway track was not quite the same as usual. For a number of reasons (COVID, climate change, travel time, we're all getting older) I think fewer core teams are achieving critical mass at DebConf - I was the only member physically present from 2 teams I'm involved in, and I'd have appreciated the opportunity to sit down with both of them for some in-person discussions. It also means it's harder to use DebConf as a venue for advancing major changes; previously, having all the decision makers in the same space for a week has meant it's possible to iron out the major discussion points, smoothing remote implementation after the conference. I'm told the mini DebConfs are where it's at for these sorts of meetings now, so perhaps I'll try to attend at least one of those next year.

Of course, I also went to a bunch of talks. I have differing levels of comment about each of them, but I've written up some brief notes below about the ones I remember something about. The comment was made that we perhaps had a lower level of deep technical talks, which is perhaps true, but I still think there were a number of high level technical talks that served to pique one's interest about the topic.

Finally, this DebConf was the first I'm aware of that was accompanied by tragedy; as part of the day trip Abraham Raji, a project member and member of the local team, was involved in a fatal accident.

Talks (videos not yet up for all, but should appear for most)
  • Opening Ceremony
    Not much to say here; welcome to DebConf!
  • Continuous Key-Signing Party introduction
    I ended up running this, as Gunnar couldn't make it. Debian makes heavy use of the OpenPGP web of trust (no mass ability to send out Yubikeys + perform appropriate levels of identity verification), so making sure we're appropriately cross-signed, and linked to local conference organisers, is a dull but important part of the conference. We use a modified keysigning approach where identity verification + fingerprint confirmation happens over the course of the conference, so this session was just to explain how that works and confirm we were all working from the same fingerprint list.
  • State of Stateless - A Talk about Immutability and Reproducibility in Debian
    Stateless OSes seem to be gaining popularity, so I went along to this to see if there was anything of note. It was interesting, but nothing earth shattering - very high level.
  • What's missing so that Debian is finally reproducible?
    Reproducible builds are something I've been keeping an eye on for a long time, and I continue to be impressed by the work folks are putting into this - both for Debian, and other projects. From a security standpoint reproducible builds provide confidence against trojaned builds, and from a developer standpoint knowing you can build reproducibly helps with not having to keep a whole bunch of binary artefacts around.
  • Hello from keyring-maint
    In the distant past the process of getting your OpenPGP key into the Debian keyring (which is used to authenticate uploads + votes, amongst other things) was a clunky process that was often stalled. This hasn't been the case for at least the past 10 years, but there's still a residual piece of project memory that thinks keyring is a blocker. So as a team we say hi and talk about the fact we do monthly updates and generally are fairly responsive these days.
  • A declarative approach to Linux networking with Netplan
    Debian's /etc/network/interfaces is a fairly basic (if powerful) mechanism for configuring network interfaces. NetworkManager is a better bet for dynamic hosts (i.e. clients), and systemd-network seems to be a good choice for servers (I'm gradually moving machines over to it). Netplan tries to provide a unified mechanism for configuring both with a single configuration language. A noble aim, but I don't see a lot of benefit for anything I use - my NetworkManager hosts are highly dynamic (so no need to push shared config) and systemd-network (or /etc/network/interfaces) works just fine on the other hosts. I'm told Netplan has more use with more complicated setups, e.g. when OpenVSwitch is involved.
  • Quick peek at ZFS, a "too good to be true" file system and volume manager.
    People who use ZFS rave about it. I'm naturally suspicious of any file system that doesn't come as part of my mainline kernel. But, as a longtime cautious mdraid+lvm+ext4 user I appreciate that there have been advances in the file system space that maybe I should look at, and I've been trying out btrfs on more machines over the past couple of years. I can't deny ZFS has a bunch of interesting features, but nothing I need/want that I can't get from an mdraid+lvm+btrfs stack (in particular data checksumming + reflinks for dedupe were strong reasons to move to btrfs over ext4).
  • Bits from the DPL
    Exactly what it says on the tin; some bits from the DPL.
  • Adulting
    Enrico is always worth hearing speak; Adulting was no exception. The main takeaway is that we need to avoid trying to run the project on martyrs and instead make sure we build a sustainable project. I've been trying really hard to accept that I just don't have time to take on additional responsibilities, no matter how interesting or relevant they might seem, so this resonated.
  • My life in git, after subversion, after CVS.
    Putting all of your home directory in revision control. I've never made this leap; I've got some Ansible playbooks that push out my core pieces of configuration, which is held in git, but I don't actually check this out directly on hosts I have accounts on. Interesting, but not for me.
  • EU Legislation BoF - Cyber Resilience Act, Product Liability Directive and CSAM Regulation
    The CRA seems to be a piece of ill-informed legislation that I'm going to have to find time to read properly. Discussion was a bit more alarmist than I personally feel is warranted, but it was a short session, had a bunch of folk in it, and even when I removed my mask it was hard to make myself understood.
  • What's new in the Linux kernel (and what's missing in Debian)
    An update from Ben about new kernel features. I'm paying less attention to such things these days, so it was nice to get a quick overview of it all.
  • Intro to SecureDrop, a sort-of Linux distro
    Actually based on Ubuntu, but with lots of overlap with Debian as a result, and highly customised anyway. Notable, to me, for using OpenPGP as some of the backend crypto support. I managed to talk to Kunal separately about some of the pain points around that, which was an interesting discussion - they're trying to move from GnuPG to Sequoia, primarily because of the much easier integration and lack of requirement for the more complicated GnuPG features that sometimes get in the way.
  • The Docker(.io) ecosystem in Debian
    I hate Docker. I'm sure it's fine if you accept it wants to take over the host machine entirely, but when I've played around with it that's not been the case. This talk was more about the difficulty of trying to keep a fast-moving upstream with lots of external dependencies properly up to date in a stable release. Vendoring the deps and trying to get a stable release exception seems like the least bad solution, but it's a problem that affects a growing number of projects.
  • Chiselled containers
    This was kind of interesting, but I think I missed the piece about why more granular packaging wasn't an option. The premise is you can take an existing .deb and chisel it into smaller components, which then helps separate out dependencies rather than pulling in as much as the original .deb would. This was touted as being useful, in particular, for building targeted containers. Definitely appealing over custom-built userspaces for containers, but in an ideal world I think we'd want the information in the main packaging, and it becomes a lot of work.
  • Debian Contributors shake-up
    Debian Contributors is a great site for massaging your ego around contributions to Debian; it's also a useful point of reference from a data protection viewpoint in terms of information the project holds about contributors - everything is already public, but the Contributors website provides folk with an easy way to find their own information (with various configurable options about whether that's made public or not). Tássia is working on improving the various data feeds into the site, but realistically this is the responsibility of every Debian service owner.
  • New Member BOF
    I'm part of the teams that help get new folk into Debian - primarily as a member of the New Member Front Desk, but also as a mostly inactive Application Manager. It's been a while since we did one of these sessions, so the Front Desk/Debian Account Managers that were present did a panel session. Nothing earth-shattering came out of it; like keyring-maint this is a team that has historically had problems, but is currently running smoothly.

30 August 2023

Andrew Cater: Building a mirror of various Red Hat oriented "stuff"

Building a mirror for rpm-based distributions.
I've already described in brief how I built a mirror that currently mirrors Debian and Ubuntu on a daily basis. That was relatively straightforward given that I know how to install Debian and configure a basic system without a GUI, and the ftpsync scripts are well maintained; I can pull some archives and get one pushed to me such that I've always got up-to-date copies of Debian and Ubuntu. I wanted to do something similar using Rocky Linux to pull in archives for Almalinux, Rocky Linux, CentOS, CentOS Stream and (optionally) Fedora. (This was originally set up using Red Hat Enterprise Linux on a developer's subscription and rebuilt using Rocky Linux so that the machine could be passed on to someone else if necessary. Red Hat 9.1 has moved to the x86_64-v2 microarchitecture baseline - on the machine I have (an HP Microserver Gen8) 9.1 fails immediately. It has been rebuilt to use Rocky 8.8.) This is a minimal install of Rocky as console only - the machine it's on only has 4G of memory so won't run a GUI reliably. It will run Cockpit so can be remotely administered. One user to run everything: mirror.
Minimal install of Rocky 8.7 from DVD .iso. SELinux is enabled; SSH works for remote access. SELinux had to be tweaked to give /srv/ the appropriate permissions to be served by nginx. /srv is a large LVM volume rather than a RAID 6 - I didn't have enough disks.
Adding nginx, enabling Cockpit and editing the Rocky Linux mirroring scripts resulted in something straightforward to reproduce.

nginx

I cheated and stole large parts of my Debian config. The crucial part to remember is that there is no autoindexing by default and I had to dig to find the correct configuration snippet.

# Load configuration files for the default server block.

include /etc/nginx/default.d/*.conf;

location / {
    autoindex on;
    autoindex_exact_size off;
    autoindex_format html;
    autoindex_localtime off;

    # First attempt to serve request as file, then
    # as directory, then fall back to displaying a 404.
    try_files $uri $uri/ =404;
}

Rocky Linux mirroring scripts

Systemd unit file for service

[Unit]
Description=Rocky Linux Mirroring script

[Service]
Type=simple
User=mirror
Group=mirror
ExecStart=/usr/local/bin/rockylinux

[Install]
WantedBy=multi-user.target

Rocky Linux timer file
[Unit]
Description=Run Rocky Linux mirroring script daily

[Timer]
OnCalendar=*-*-* 08:13:00
OnCalendar=*-*-* 22:13:00
Persistent=true

[Install]
WantedBy=timers.target
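Assuming the unit files above are installed as rockylinux.service and rockylinux.timer (the names are an assumption; the post does not state them), activating the timer looks like this:

cp rockylinux.service rockylinux.timer /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now rockylinux.timer
systemctl list-timers rockylinux.timer   # confirm the next scheduled runs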

Mirror script

#!/usr/bin/env bash
#
# mirrorsync - Synchronize a Rocky Linux mirror
# By: Dennis Koerner <koerner@netzwerge.de>
#

# The latest version of this script can be found at:
# https://github.com/rocky-linux/rocky-tools
#
# Please read https://docs.rockylinux.org/en/rocky/8/guides/add_mirror_manager
# for further information on setting up a Rocky mirror.
#
# Copyright (c) 2021 Rocky Enterprise Software Foundation

This is a very long script in total. The only crucial parts I changed were the mirror to pull from and the place to put the files:

# A complete list of mirrors can be found at
# https://mirrors.rockylinux.org/mirrormanager/mirrors/Rocky
src="mirrors.vinters.com::rocky"

# Your local path. Change to whatever fits your system.
# $mirrormodule is also used in syslog output.
mirrormodule="rocky-linux"
dst="/srv/${mirrormodule}"

filelistfile="fullfiletimelist-rocky"
lockfile="/home/mirror/rocky.lockfile"
logfile="/home/mirror/rocky.log"

The logfile looks something like this. The single file time list is used to check whether another rsync needs to be run:
deleting 9.1/plus/x86_64/os/repodata/3585b8b5-90e0-4856-9df2-95f646bc62c7-PRIMARY.xml.gz

sent 606,565 bytes received 38,808,194,155 bytes 44,839,746.64 bytes/sec
total size is 1,072,593,052,385 speedup is 27.64
End: Fri 27 Jan 2023 08:27:49 GMT
fullfiletimelist-rocky unchanged. Not updating at Fri 27 Jan 2023 22:13:16 GMT
fullfiletimelist-rocky unchanged. Not updating at Sat 28 Jan 2023 08:13:16 GMT
It was essentially easier to store fullfiletimelist-rocky in /home/mirror than anywhere else.
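For those who don't want to read the whole script, its core logic condenses to something like the following sketch; this is illustrative, not the actual upstream code, and the rsync flags are representative:

# fetch the file time list and only run a full sync when it changed
rsync -q "$src/$filelistfile" "/home/mirror/$filelistfile.new"
if cmp -s "/home/mirror/$filelistfile" "/home/mirror/$filelistfile.new"; then
    echo "$filelistfile unchanged. Not updating at $(date)" >> "$logfile"
    exit 0
fi
mv "/home/mirror/$filelistfile.new" "/home/mirror/$filelistfile"
# something changed upstream: sync the whole module
rsync -avSH --delete "$src/" "$dst/" >> "$logfile" 2>&1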

Very similar small modifications to the Rocky mirroring scripts were used to mirror the other distributions I'm mirroring. (Almalinux, CentOS, CentOS Stream, EPEL and Rocky Linux).

29 August 2023

Jonathan Dowland: Gazelle Twin

A couple of releases
I discovered Gazelle Twin last year via Stuart Maconie's Freak Zone (two of her tracks ended up on my 2022 Halloween playlist). Through her website I learned of The Horror Show! exhibition at Somerset House1 in London, which I managed to visit earlier this year. I've been intending to write a 5-track blog post (a la Underworld, the Cure, Coil) for a while, but I have been spurred on by the excellent news that she's got a new album on the way, and she's performing at the Sage Gateshead in November. Buy tickets now!! Here are the five tracks I recommend to get started:
  1. Anti-Body, from 2014's UNFLESH. I particularly love the percussion. Perc did a good hard-house-style remix on Fleshed Out, the companion remix album. Anti-Body by Gazelle Twin
  2. Fire Leap, from Gazelle Twin and NYX's collaborative album Deep England. The album is a re-interpretation of material from Gazelle Twin's earlier album, Pastoral, with the exception of this track, which is a cover of Paul Giovanni's song from The Wicker Man. There's a common aesthetic in all three works: eerie-folk, England's self-mythologising as seen through a warped and cracked lens. There's a fantastic performance video of the whole album, directed by Iain Forsyth and Jane Pollard, available on YouTube.
  3. Better In My Day, from the aforementioned Pastoral. This track and this album are, I think, less accessible, more challenging than the re-interpreted material. That's not a bad thing: I'm still working on digesting it! This is one of the more abrasive, confrontational tracks. Better In My Day by Gazelle Twin
  4. I am Shell I am Bone, from way back to her first release, The Entire City in 2011. Composed, recorded, self-produced, self-released. It's remarkable to me how different each phase of GT's work is to one another. This album evokes a strong sense of atmosphere and place to me. There's a hint of possible influence of Joy Division's Unknown Pleasures (or New Order's Movement) in places. The b-side to this song is a cover of Joy Division's The Eternal. I Am Shell I Am Bone by Gazelle Twin
  5. GT re-issued The Entire City in 2022 along with a companion-piece EP of newly-released material from the same era, The Wastelands. This isn't cutting-room-floor stuff, though, as evidenced by the strength of my final pick, Hole in my Heart. Hole In My Heart by Gazelle Twin
It's hard to pick just five tracks when doing these (that's the point, I suppose). I note that I haven't picked anything from her wide-ranging soundtrack work: her last three or four releases have been soundtrack works, released on the well-respected UK label Invada, as will be her forthcoming album. You can find all the released stuff on Gazelle Twin's Bandcamp Page.

  1. I recently learned that PJ Harvey once hosted album sessions in Somerset House, and allowed fans to watch her at work during the recording: https://www.theguardian.com/music/2015/jan/16/pj-harvey-somerset-house-recording-in-progress

10 August 2023

Freexian Collaborators: Debian Contributions: PTS tracker, DebConf23 Bursary, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

tracker.debian.org work, by Raphaël Hertzog. Raphaël spent some time during his vacation updating distro-tracker to be fully compatible with Django 3.2 so that the codebase and the whole testsuite can run on Debian 12. There's one exception though: the functional test suite still needs further work to cope with the latest version of Selenium. By dropping support of Django 2.2, Raphaël could also start to work toward support of Django 4.2, since Django helpfully emits deprecation warnings of things that will break in future versions. All the warnings have been fixed, but the codebase still fails its testsuite in Django 4.2 because we have to get rid of the python3-django-jsonfield dependency (which is rightfully dead upstream since Django has native support nowadays). All the JSONFields have been converted to use Django's native field, but the migration system still requires that dependency at this point. This will require either a fresh reboot of the migration history, or some other trickery to erase the jsonfield dependency from the history of migrations. If you have experience with that, don't hesitate to share it (mail at hertzog@debian.org, or reach out to buxy on IRC). At this point, tracker.debian.org runs with Django 3.2 on Debian 11 since the Debian System Administrators are not yet ready to upgrade debian.org hosts to Debian 12.

DebConf 23 bursary work, by Utkarsh Gupta. Utkarsh led the bursary team this year. The bursary team got a ton of requests this time. Rolling out the results in 4 batches, the bursary team catered to over 165 bursary requests - which is superb! The team managed to address all the requests and answered a bit over 120 emails in the process. With that, the bursaries are officially closed for DebConf 2023. The team also intends to roll out some of the statistics closer to DebConf.

Miscellaneous contributions
  • Stefano implemented meson support in pybuild, the tool for building Python packages in Debian against multiple Python versions.
  • Santiago did some work on Salsa CI to enhance the ARM support on the autopkgtest job and make Josch's branch work. MR to come soon.
  • Helmut sent patches for 6 cross build failures.
  • Stefano has been preparing for DebConf 23: working on the website and assisting the local teams.
  • Stefano attended the DebConf Video team sprint in Paris, mostly looking at new hardware and software options for video capture and live-mixing. Full sprint report.

5 August 2023

Bits from Debian: Debian Project Bits Volume 1, Issue 1


Debian Project Bits Volume 1, Issue 1
August 05, 2023

Welcome to the inaugural issue of Debian Project Bits! Those remembering the Debian Weekly News (DwN) will recognize some of the sections here, which served as our inspiration. Debian Project Bits posts will allow for a faster turnaround of some project news on a monthly basis. The Debian Micronews service will continue to share shorter news items; the Debian Project News remains our official newsletter, which may move to a biannual archive format.

News

Debian Day

The Debian Project was officially founded by Ian Murdock on August 16, 1993. Since then we have celebrated our Anniversary of that date each year with events around the world. We would love it if you could join our revels this very special year as we have the honor of turning 30! Attend or organize a local Debian Day celebration. You're invited to plan your own event: from Bug Squashing parties to Key Signing parties, Meet-Ups, or any type of social event, whether large or small. And be sure to check our Debian reimbursement How To if you need such resources. You can share your days, events, thoughts, or notes with us and the rest of the community with the #debianday tag that will be used across most social media platforms. See you then!

Events: Upcoming and Reports

Upcoming

Debian 30 anos

The Debian Brasil Community is organizing the event Debian 30 anos to celebrate the 30th anniversary of the Debian Project. From August 14 to 18, between 7pm and 10pm (UTC-3), contributors will talk online in Portuguese and we will live stream on the Debian Brasil YouTube channel.

DebConf23: Debian Developers Camp and Conference

The 2023 Debian Developers Camp (DebCamp) and Conference (DebConf23) will be hosted this year in Infopark, Kochi, India. DebCamp is slated to run from September 3 through 9, immediately followed by the larger DebConf, September 10 through 17. If you are planning on attending the conference this year, now is the time to ensure your travel documentation, visa information, bursary submissions, papers and relevant equipment are prepared. For more information contact: debconf@debconf.

MiniDebConf Cambridge 2023

There will be a MiniDebConf held in Cambridge, UK, hosted by ARM, for 4 days in November: 2 days for a mini-DebCamp (Thu 23 - Fri 24), with space for dedicated development / sprint / team meetings, then two days for a more regular MiniDebConf (Sat 25 - Sun 26) with space for more general talks, up to 80 people.

Reports

During the last months, the Debian Community has organized some Bug Squashing Parties:
  • Tilburg, Netherlands, October 2022
  • St-Cergue, Switzerland, January 2023
  • Montreal, Canada, February 2023
In January, Debian India hosted the MiniDebConf Tamil Nadu in Viluppuram, Tamil Nadu, India (Sat 28 - Sun 29). The following month, the MiniDebConf Portugal 2023 was held in Lisbon (12 - 16 February 2023). These events, seen as a stunning success by some of their attendees, demonstrate the vitality of our community.
Debian Brasil Community at Campus Party Brazil 2023

Another edition of Campus Party Brazil took place in the city of São Paulo between July 25th and 30th. And one more time the Debian Brazil Community was present. During the days in the available space, we carried out some activities such as: For more info and a few photos, check out the organizers' report.

MiniDebConf Brasília 2023

From May 25 to 27, Brasília hosted the MiniDebConf Brasília 2023. This gathering was composed of various activities such as talks, workshops, sprints, BSPs (Bug Squashing Party), key signings, social events, and hacking, and aimed to bring the community together and celebrate the world's largest Free Software project: Debian. For more information please see the full report written by the organizers.

Debian Reunion Hamburg 2023

This year the annual Debian Reunion Hamburg was held from Tuesday 23 to 30 May, starting with four days of hacking followed by two days of talks, and then two more days of hacking. As usual, people - more than forty-five attendees from Germany, Czechia, France, Slovakia, and Switzerland - were happy to meet in person, to hack and chat together, and much more. If you missed the live streams, the video recordings are available.

Translation workshops from the pt_BR team

The Brazilian translation team, debian-l10n-portuguese, had their first workshop of 2023 in February with great results. The workshop was aimed at beginners, working in DDTP/DDTSS. For more information please see the full report written by the organizers. And on June 13 another workshop took place to translate The Debian Administrator's Handbook. The main goal was to show beginners how to collaborate in the translation of this important material, which has existed since 2004. The manual's translations are hosted on Weblate.

Releases

Stable Release

Debian 12 bookworm was released on June 10, 2023. This new version becomes the stable release of Debian and moves the prior Debian 11 bullseye release to oldstable status. The Debian community celebrated the release with 23 Release Parties all around the world. Bookworm's first point release 12.1, addressing miscellaneous bug fixes affecting 88 packages, along with documentation and installer updates, was made available on July 22, 2023.

RISC-V support

riscv64 has recently been added to the official Debian architectures for support of 64-bit little-endian RISC-V hardware running the Linux kernel. We expect to have full riscv64 support in Debian 13 trixie. Updates on bootstrap, build daemon, porterbox, and development progress were recently shared by the team in a Bits from the Debian riscv64 porters post.

non-free-firmware

The Debian 12 bookworm archive now includes non-free-firmware; please be sure to update your apt sources.list if your system requires such components for operation. If your previous sources.list included non-free for this purpose, it may safely be removed.

apt sources.list

The Debian archive holds several components: Example of the sources.list file:
deb http://deb.debian.org/debian bookworm main
deb-src http://deb.debian.org/debian bookworm main
deb http://deb.debian.org/debian-security/ bookworm-security main
deb-src http://deb.debian.org/debian-security/ bookworm-security main
deb http://deb.debian.org/debian bookworm-updates main
deb-src http://deb.debian.org/debian bookworm-updates main
Example using the components:
deb http://deb.debian.org/debian bookworm main non-free-firmware
deb-src http://deb.debian.org/debian bookworm main non-free-firmware
deb http://deb.debian.org/debian-security/ bookworm-security main non-free-firmware
deb-src http://deb.debian.org/debian-security/ bookworm-security main non-free-firmware
deb http://deb.debian.org/debian bookworm-updates main non-free-firmware
deb-src http://deb.debian.org/debian bookworm-updates main non-free-firmware
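After updating sources.list, refresh the package indices and confirm the new component is visible; firmware-iwlwifi below is just one example of a package shipped in non-free-firmware:

apt update
apt-cache policy firmware-iwlwifi   # should show a bookworm/non-free-firmware candidate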
For more information and guidelines on proper configuration of the apt sources.list file please see the Configuring Apt Sources - Wiki page.

Inside Debian

New Debian Members

Please welcome the following newest Debian Project Members: To find out more about our newest members or any Debian Developer, look for them on the Debian People list.

Security

Debian's Security Team releases current advisories on a daily basis. Some recently released advisories concern these packages:

trafficserver. Several vulnerabilities were discovered in Apache Traffic Server, a reverse and forward proxy server, which could result in information disclosure or denial of service.

asterisk. A flaw was found in Asterisk, an Open Source Private Branch Exchange. A buffer overflow vulnerability affects users that use the PJSIP DNS resolver. This vulnerability is related to CVE-2022-24793. The difference is that this issue is in parsing the query record, parse_query(), while the issue in CVE-2022-24793 is in parse_rr(). A workaround is to disable DNS resolution in the PJSIP config (by setting nameserver_count to zero) or use an external resolver implementation instead.

flask. It was discovered that in some conditions the Flask web framework may disclose a session cookie.

chromium. Multiple security issues were discovered in Chromium, which could result in the execution of arbitrary code, denial of service or information disclosure.

Other

Popular packages

gpgv - GNU privacy guard signature verification tool. 99,053 installations. gpgv is actually a stripped-down version of gpg which is only able to check signatures. It is somewhat smaller than the fully-blown gpg and uses a different (and simpler) way to check that the public keys used to make the signature are valid. There are no configuration files and only a few options are implemented.

dmsetup - Linux Kernel Device Mapper userspace library. 77,769 installations. The Linux Kernel Device Mapper is the LVM (Linux Logical Volume Management) Team's implementation of a minimalistic kernel-space driver that handles volume management, while keeping knowledge of the underlying device layout in user-space. This makes it useful for not only LVM, but software raid, and other drivers that create "virtual" block devices.

sensible-utils - Utilities for sensible alternative selection. 96,001 daily users. This package provides a number of small utilities which are used by programs to sensibly select and spawn an appropriate browser, editor, or pager. The specific utilities included are: sensible-browser sensible-editor sensible-pager.

popularity-contest - The popularity-contest package. 90,758 daily users. The popularity-contest package sets up a cron job that will periodically anonymously submit to the Debian developers statistics about the most used Debian packages on the system. This information helps Debian make decisions such as which packages should go on the first CD. It also lets Debian improve future versions of the distribution so that the most popular packages are the ones which are installed automatically for new users.

New and noteworthy packages in unstable

Toolkit for scalable simulation of distributed applications

SimGrid is a toolkit that provides core functionalities for the simulation of distributed applications in heterogeneous distributed environments. SimGrid can be used as a Grid simulator, a P2P simulator, a Cloud simulator, a MPI simulator, or a mix of all of them. 
The typical use-cases of SimGrid include heuristic evaluation, application prototyping, and real application development and tuning. This package contains the dynamic libraries and runtime.

LDraw mklist program

3D CAD programs and rendering programs using the LDraw parts library of LEGO parts rely on a file called parts.lst containing a list of all available parts. The program ldraw-mklist is used to generate this list from a directory of LDraw parts.

Open Lighting Architecture - RDM Responder Tests

The DMX512 standard for Digital MultipleX is used for digital communication networks commonly used to control stage lighting and effects. The Remote Device Management protocol is an extension to DMX512, allowing bi-directional communication between RDM-compliant devices without disturbing other devices on the same connection. The Open Lighting Architecture (OLA) provides a plugin framework for distributing DMX512 control signals. The ola-rdm-tests package provides an automated way to check protocol compliance in RDM devices.

parsec-service

Parsec is an abstraction layer that can be used to interact with hardware-backed security facilities such as the Hardware Security Module (HSM), the Trusted Platform Module (TPM), as well as firmware-backed and isolated software services. The core component of Parsec is the security service, provided by this package. The service is a background process that runs on the host platform and provides connectivity with the secure facilities of that host, exposing a platform-neutral API that can be consumed into different programming languages using a client library. For a client library implemented in Rust see the package librust-parsec-interface-dev.

Simple network calculator and lookup tool

Process and lookup network addresses from the command line or CSV with ripalc. Output has a variety of customisable formats.

High performance, open source CPU/GPU miner and RandomX benchmark

XMRig is a high performance, open source, cross platform RandomX, KawPow, CryptoNight, and GhostRider unified CPU/GPU miner and RandomX benchmark.

Ping, but with a graph - Rust source code

This package contains the source for the Rust gping crate, packaged by debcargo for use with cargo and dh-cargo.

Once upon a time in Debian:

  • 2014-07-31: The Technical Committee chose libjpeg-turbo as the default JPEG decoder.
  • 2010-08-01: DebConf10 starts in New York City, USA.
  • 2007-08-05: Debian Maintainers approved by vote.
  • 2009-08-05: Jeff Chimene files bug #540000 against live-initramfs.

Calls for help

The Publicity team calls for volunteers and help! Your Publicity team is asking for help from you, our readers, developers, and interested parties, to contribute to the Debian news effort. We implore you to submit items that may be of interest to our community and also ask for your assistance with translations of the news into (your!) other languages, along with the needed second or third set of eyes to assist in editing our work before publishing. If you can share a small amount of your time to aid our team, which strives to keep all of us informed, we need you. Please reach out to us via IRC on #debian-publicity on OFTC.net, or our public mailing list, or via email at press@debian.org for sensitive or private inquiries.

4 August 2023

Shirish Agarwal: License Raj 2.0, 2023

About a week back, Jio launched a laptop called JioBook that will be manufactured in China.
The most interesting thing is that the whole thing will be produced in Hunan, China. Then 3 days later India mandated a licensing requirement for Apple, Dell and other laptop/tablet manufacturers. And all of this in the guise of "Make in India". It is similar to how India has exempted Adani and the Tatas from buying as many solar cells as are needed and then selling the same in India. Reliance will basically be monopolizing the laptop business. And if people think that projects like Raspberry Pi, Arduino etc. will be exempted, they have another think coming.

History of License Raj

After India became free, in the 1980s the Congress wanted to open its markets to the world, just like China did. But at that time the BJP, though small, via the Jan Sangh made the argument that we were not ready for the world: the Indian businessman needed a bit more time. And hence a compromise was made. The compromise was simple: Indian industry, and people who wanted to get anything from the West, needed a license. This was very much in line with how the Russian economy was evolving. All three nations, India, China and Russia, were on similar paths. China broke away when it opened up limited markets for competition and gave state support to its firms. Russia and Japan, on the other hand, kept their markets relatively closed. What happened in Russia and elsewhere happened in India too: the businessman got what he wanted, he just corrupted the system. Reliance, the conglomerate of today, abused the same system as much as it could. Its defence was to be seen as the small guy. I won't go into that, as that would be a big story in itself. Whatever was sold in India was sold with huge commissions, and just like in Russia, scarcity became the order of the day. Monopolies flourished and competition was nowhere. This remained the case till 1991, when Finance Minister Manmohan Singh was forced to liberalize and open up the markets. Even at that time, the RSS through its Swadeshi Jagran Manch was sharing end-of-the-world prophecies for the Indian businessman.

2014 Current Regime In 2010, in the U.K., the Conservative party came to power under the leadership of David Cameron, who was influenced by the policies of Margaret Thatcher, who had arguably ditched manufacturing in the UK. From 2010 onwards David Cameron and his party did the same, but to public services, under the name of austerity. India has been doing the same. Inequality has gone up while people's purchasing power has gone drastically down. CMIE figures are much more drastic, and education is a joke.
Add to that, since 2016 funding for scientists has gone to the dogs, and now they are even playing with doctors' careers. I do not have to remind people that a woman scientist took almost a quarter of a century to find a drug delivery system that others said was impossible, and she did it using public finance. Science is hard. I have already shared in a previous blog post how it took the Chinese 20 years to reach where they are, and somehow we think we will be able to overtake both China and Japan. Of the 160-odd countries on planet Earth, only a handful have both the means and the knowledge to use and expand on that. While I was not part of the Taiwan DebConf, I later came to know that even Taiwan in many ways is similar to Japan, in the sense that a majority of its population is stuck in low-paid jobs (apart from those employed by TSMC), much like under the keiretsu of Japan or the chaebol of South Korea. In all these cases, only a small percentage of the economy is going forward while the rest is stagnating or even going backwards. Similar is the case in India as well. Unlike the Americans, who chose the path of more competition, we have chosen the path of more monopolies. So even though I very much liked Louis's project, sooner or later finding the devices themselves would be hard. While the recent notification is for laptops, what stops them from doing the same with mobiles or even desktop systems? As it is, both smartphones and desktop systems have been contracting since last year as food inflation has gone up. Add to that, availability of products has been made scarce (whether by design or not is unknown). The end result: the latest processor launched overseas becomes the new thing here 3-4 years later. And that was before this notification. This will only decrease competition and make the Ambanis rich at the cost of everyone else. So much for 'ease of doing business'. Also, the backlash has been pretty tepid, so what I shared will probably happen again sooner or later. The only interesting thing is that the JioBook is based on Android, probably in part due to the issues people are seeing in Windows 10, 11 and whatnot. Till later.
Update: The Print tried a decluttering but instead cluttered the topic. While all that he shared was true, and it certainly is a step backwards, he didn't need to show how most Indians once had to go to the RBI for the same. I remember my Mamaji doing that and sharing afterwards that all he had was $100 for a day, which, while being a big sum, was paltry if you were staying in a hotel and were there on company business; he survived on bananas and whatever cheap veg he could find. This was almost 35-40 odd years ago. As shared, the Govt. has been making missteps for quite some time now. The Print does try to take a balanced line so it doesn't run counter to the Government, but even it knows this is a bad move. The whole thing about security is just laughable: did they wake up after 9 years? And now, in its own wisdom, it has apparently shifted the ban from taking effect now to three months later. Of course, most people on the right are just applauding without understanding the complexities and implications of the same. Vendors like Samsung and Apple, who have set up assembly operations here, would think twice and shift to Taiwan, Vietnam, Mexico, anywhere. Global money follows global trends, and such missteps do not help.

Implications for A.I. products One of the things that has not been thought about is how companies making A.I. products in India, including MNCs, will suffer. Most of them are in stealth mode right now, but their products are made for Intel, AMD or ARM, depending on what works for them. There is nothing to tell whether the companies made their plea, or whether it was heard or unheard. If the Government doesn't reverse the policy, then sooner or later they will either have to go abroad or cash out/sell to somebody else. Some people on the right also know this, but for whatever reason have chosen to remain silent. Till later.

Reproducible Builds: Reproducible Builds in July 2023

Welcome to the July 2023 report from the Reproducible Builds project. In our reports, we try to outline the most important things that we have been up to over the past month. As ever, if you are interested in contributing to the project, please visit the Contribute page on our website.
Marcel Fourné et al. presented at the IEEE Symposium on Security and Privacy in San Francisco, CA on "The Importance and Challenges of Reproducible Builds for Software Supply Chain Security". As summarised in last month's report, the abstract of their paper begins:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created. (PDF)
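The "bitwise-identical artifacts" property in that definition can be checked mechanically. A minimal sketch, assuming a hypothetical package whose build is already reproducible (directory and file names here are illustrative, not from the paper):
(cd build-a && make)   # build the same source tree twice, in two scratch checkouts
(cd build-b && make)
sha256sum build-a/foo.bin build-b/foo.bin   # identical digests mean bitwise-identical artifacts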

Chris Lamb published an interview with Simon Butler, associate senior lecturer in the School of Informatics at the University of Skövde, on the business adoption of Reproducible Builds. (This is actually the seventh instalment in a series featuring the projects, companies and individuals who support our project. We started this series by featuring the Civil Infrastructure Platform project, and followed this up with a post about the Ford Foundation as well as recent ones about ARDC, the Google Open Source Security Team (GOSST), Bootstrappable Builds, the F-Droid project and David A. Wheeler.) Vagrant Cascadian presented Breaking the Chains of Trusting Trust at FOSSY 2023.
Rahul Bajaj has been working with Roland Clobus on merging an overview of environment variations to our website:
I have identified 16 root causes for unreproducible builds in my empirical study, which I have linked to the corresponding documentation. The initial MR right now contains information about 10 root causes. For each root cause, I have provided a definition, a notable instance, and a workaround. However, I have only found workarounds for 5 out of the 10 root causes listed in this merge request. In the upcoming commits, I plan to add an additional 6 root causes. I kindly request you review the text for any necessary refinements, modifications, or corrections. Additionally, I would appreciate the help with documentation for the solutions/workarounds for the remaining root causes: Archive Metadata, Build ID, File System Ordering, File Permissions, and Snippet Encoding. Your input on the identified root causes for unreproducible builds would be greatly appreciated. [ ]

Just a reminder that our upcoming Reproducible Builds Summit is set to take place from October 31st to November 2nd 2023 in Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. If you're interested in joining us this year, please make sure to read the event page, which has more details about the event and location.
There was more progress towards making the Go programming language ecosystem reproducible this month, including: In addition, kpcyrd posted to our mailing list to report that:
while packaging govulncheck for Arch Linux I noticed a checksum mismatch for a tar file I downloaded from go.googlesource.com. I used diffoscope to compare the .tar file I downloaded with the .tar file the build server downloaded, and noticed the timestamps are different.
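For readers who have not used it, diffoscope recursively compares the contents and metadata of two archives; a sketch of that kind of invocation (file names hypothetical):
diffoscope local.tar buildserver.tar
diffoscope --text report.txt local.tar buildserver.tar   # write the comparison to a text file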

In Debian, 20 reviews of Debian packages were added, 25 were updated and 25 were removed this month, adding to our knowledge about identified issues. A number of issue types were updated, including marking ffile_prefix_map_passed_to_clang as being fixed since Debian bullseye [ ] and adding a Debian bug tracker reference for the nondeterminism_added_by_pyqt5_pyrcc5 issue [ ]. In addition, Roland Clobus posted another detailed update of the status of reproducible Debian ISO images on our mailing list. In particular, Roland helpfully summarised that "live images are looking good, and the number of (passing) automated tests is growing".
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE.
F-Droid added 20 new reproducible apps in July, making 165 apps in total that are published with Reproducible Builds and using the upstream developer's signature. [ ]
The Sphinx documentation tool recently accepted a change to improve the deterministic reproducibility of documentation. Its internal util.inspect.object_description function attempts to sort collections, but this can fail. The change handles the failure case by using string-based object descriptions as a fallback deterministic sort ordering, as well as adding recursive object-description calls for list and tuple datatypes. As a result, documentation generated by Sphinx will be more likely to be automatically reproducible. Lastly in news, kpcyrd posted to our mailing list announcing a new repro-env tool:
My initial interest in reproducible builds was "how do I distribute pre-compiled binaries on GitHub without people raising security concerns about them". I've cycled back to this original problem about 5 years later and built a tool that is meant to address this. [ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
In diffoscope development this month, versions 244, 245 and 246 were uploaded to Debian unstable by Chris Lamb, who also made the following changes:
  • Don't include the file size in image metadata. It is, at best, distracting, and it is already in the directory metadata. [ ]
  • Add compatibility with libarchive-5. [ ]
  • Mark that the test_dex::test_javap_14_differences test requires the procyon tool. [ ]
  • Initial work on DOS/MBR extraction. [ ]
  • Move to using assert_diff in the .ico and .jpeg tests. [ ]
  • Temporarily mark some Android-related tests as XFAIL due to Debian bugs #1040941 & #1040916. [ ]
  • Fix the "test skipped" reason generation in the case of a version outside of the required range. [ ]
  • Update copyright years. [ ][ ]
  • Fix try.diffoscope.org. [ ]
In addition, Gianfranco Costamagna added support for LLVM version 16. [ ]

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In July, a number of changes were made by Holger Levsen:
  • General changes:
    • Upgrade Jenkins host to Debian bookworm now that Debian 12.1 is out. [ ][ ][ ][ ]
    • djm: improve UX when rebooting a node fails. [ ]
    • djm: reduce wait time between rebooting nodes. [ ]
  • Debian-related changes:
    • Various refactoring of the Debian scheduler. [ ][ ][ ]
    • Make Debian live builds more robust with respect to salsa.debian.org returning HTTP 502 errors. [ ][ ]
    • Use the legacy SCP protocol instead of the SFTP protocol when transferring Debian live builds (see the example command after this list). [ ][ ]
    • Speed up a number of database queries (thanks, Myon!). [ ][ ][ ][ ][ ]
    • Split the create_meta_pkg_sets job into two (for Debian unstable and Debian testing) to halve the job runtime to approximately 90 minutes. [ ][ ]
    • Split scheduler job into four separate jobs, one for each tested architecture. [ ][ ]
    • Treat more PostgreSQL errors as serious (for some jobs). [ ]
    • Re-enable automatic database documentation now that postgresql_autodoc is back in Debian bookworm. [ ]
    • Remove various hardcoding of Debian release names. [ ]
    • Drop some i386 special casing. [ ]
  • Other distributions:
    • Speed up Alpine SQL queries. [ ]
    • Adjust CSS layout for Arch Linux pages to match 3 and not 4 repos being tested. [ ]
    • Drop the community Arch Linux repo as it has now been merged into the extra repo. [ ]
    • Speed up a number of Arch-related database queries. [ ]
    • Try harder to properly cleanup after building OpenWrt packages. [ ]
    • Drop all kfreebsd-related tests now that it's officially dead. [ ]
  • System health:
    • Always ignore some well-known harmless orphan processes. [ ][ ][ ]
    • Detect another case of job failure due to Jenkins shutdown. [ ]
    • Show all non co-installable package sets on the status page. [ ]
    • Warn that some specific reboot nodes are currently false-positives. [ ]
  • Node health checks:
    • Run system and node health checks for Jenkins less frequently. [ ]
    • Try to restart any failed dpkg-db-backup [ ] and munin-node services [ ].
In addition, Vagrant Cascadian updated the paths in our automated tests to use the same paths used by the official Debian build servers. [ ]
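As a side note on the SCP-versus-SFTP item referenced in the Debian list above: since OpenSSH 9.0 the scp client defaults to the SFTP protocol, and the -O flag selects the legacy SCP protocol, which helps when the remote side offers no SFTP subsystem. A sketch, with host and path purely illustrative:
scp -O debian-live.iso jenkins@example.org:/srv/live-builds/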

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

1 August 2023

Reproducible Builds: Supporter spotlight: Simon Butler on business adoption of Reproducible Builds

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do. This is the seventh instalment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. We started this series by featuring the Civil Infrastructure Platform project, and followed this up with a post about the Ford Foundation as well as recent ones about ARDC, the Google Open Source Security Team (GOSST), Bootstrappable Builds, the F-Droid project and David A. Wheeler. Today, however, we will be talking with Simon Butler, an associate senior lecturer in the School of Informatics at the University of Skövde, where he undertakes research in software engineering that focuses on IoT and open source software, and contributes to the teaching of computer science to undergraduates.

Chris: For those who have not heard of it before, can you tell us more about the School of Informatics at Skövde University? Simon: Certainly, but I may be a little long-winded. Skövde is a city in the area between the two large lakes in southern Sweden. The city is a busy place. Skövde is home to the regional hospital, some of Volvo's manufacturing facilities, two regiments of the Swedish defence force, a lot of businesses in the Swedish computer games industry, other tech companies and more. The University of Skövde is relatively small. Sweden's large land area and low population density mean that regional centres such as Skövde are important and local universities support businesses by training new staff and supporting innovation. The School of Informatics has two divisions. One focuses on teaching and researching computer games. The other division encompasses a wider range of teaching and research, including computer science, web development, computer security, network administration, data science and so on.
Chris: You recently had an open-access paper published in Software Quality Journal. Could you tell us a little bit more about it and perhaps briefly summarise its key findings? Simon: The paper is one output of a collaborative research project with six Swedish businesses that use open source software. There are two parts to the paper. The first consists of an analysis of what the group of businesses in the project know about Reproducible Builds (R-Bs), their experiences with R-Bs and their perception of the value of R-Bs to the businesses. The second part is an interview study with business practitioners and others with experience and expertise in R-Bs. We set out to try to understand the extent to which software-intensive businesses were aware of R-Bs, the technical and business reasons they were or were not using R-Bs, and to document the business and technical use cases for R-Bs. The key findings were that businesses are aware of R-Bs, and some are using R-Bs as part of their day-to-day development process. Some of the uses for R-Bs we found were not previously documented. We also found that businesses understood the value R-Bs have as part of engineering and software quality processes. They are also aware of the costs of implementing R-Bs and that R-Bs are an intangible value proposition - in other words, businesses can add value through process improvement by using R-Bs. But, currently at least, R-Bs are not a selling point for software or products.
Chris: You performed a large number of interviews in order to prepare your paper. What was the most surprising response to you? Simon: Most surprising is a good question. Everybody I spoke to brought something new to my understanding of R-Bs, and many responses surprised me. The interviewees that surprised me most were I01 and I02 (interviews were anonymised and interviewees were assigned numeric identities). I02 described the sceptical perspective that there is a viable, pragmatic alternative to R-Bs - verifiable builds - which I was aware of before undertaking the research. The company had developed a system that was sufficiently robust for their needs and that worked well. With a large archive of software used in production, they couldn't justify the cost of retrofitting a different solution that might only offer small advantages over the existing system. That doesn't really sound too surprising, but the interview was one of the first I did on this topic, and I was very focused on the value of, and need for, trust in a system, which is what motivated the R-B. The solution used by the company requires trust, but they seem to have established sufficient trust for their needs by securing their build systems to the extent that they are more or less tamper-proof. The other big surprise for me was I01's use of R-Bs to support the verification of system configuration in a system with multiple embedded components at boot time. It's such an obvious application of R-Bs, and exactly the kind of response I hoped to get from interviewees. However, it is another instance of a solution where trust is only one factor. In the first instance, the developer is using R-Bs to establish trust in the toolchain. There is also the second application, in which the developer can use a set of R-Bs to establish that a deployed system consists of compatible components. While this might not sound too significant, there appear to be some important potential applications. One that came to mind immediately is a problem with firmware updates on nodes in IoT systems, where the node needs to update quickly with limited downtime and without failure. The node also needs to be able to roll back any update proposed by a server if there are conflicts with the current configuration or if any tests on the node fail. Perhaps the chances of failure could be reduced if a node could instead negotiate with a server to determine a safe path to migrate from its current configuration to a working configuration with the upgraded components the central system requires? Another potential application appears to be in the configuration management of AI systems, where decisions need to be explainable. A means of specifying validated configurations of training data, models and deployed systems might, perhaps, be leveraged to prevent invalid or broken configurations from being deployed in production.
Chris: One of your findings was that reproducible builds were perceived to be "good engineering practice". To what extent do you believe cultural forces affect the adoption or rejection of a given technology or practice? Simon: To a large extent. People's decisions are informed by cultural norms, and business decisions are made by people acting collectively. Of course, decision-making, including assessments of risk and usefulness, is mediated by individual positions on the continuum from conformity to non-conformity, as well as individual and in-group norms. Whether a business will consider a given technology for adoption will depend on cultural forces. The decision to adopt may well be made on the grounds of cost and benefits.
Chris: Another conclusion implied by your research is that businesses are often dealing with software deployment lifespans (e.g. 20+ years) that differ widely from those of the typical hobbyist programmer. To what degree do you think this temporal mismatch is a problem for both groups? Simon: This is a fascinating question. Long-term software maintenance is a requirement in some industries because of the working lifespans of the products and legal requirements to maintain the products for a fixed period. For some other industries, it is less of a problem. Consequently, I would tend to divide developers into those who have been exposed to long-term maintenance problems and those who have not, although more professional developers than hobbyists will have been exposed to the problem. Nonetheless, there are areas, such as music software, where there are also long-term maintenance challenges for data formats and software.
Chris: Based on your research, what would you say are the biggest blockers for the adoption of reproducible builds within business? And, based on this, would you have any advice or recommendations for the broader reproducible builds ecosystem? Simon: From the research, the main blocker appears to be cost. Not an absolute cost, but there is an overhead to introducing R-Bs. Businesses (and thus business managers) need to understand the business case for R-Bs. Making decision-makers in businesses aware of R-Bs, and aware that they are valuable, will take time. Advocacy at multiple levels appears to be the way forward, and this is being done. I would recommend being persistent while being patient, and to keep talking about reproducible builds. The work done in Linux distributions raises awareness of R-Bs amongst developers. Guix, NixOS and Software Heritage are all providing practical solutions and getting attention - I've been seeing progressively more mentions of all three during the last couple of years. Increased awareness amongst developers should lead to more interest within companies. There is also research money being assigned to supply chain security and R-Bs. The CHAINS project at KTH in Stockholm is one example of a strategic research project. There may be others that I'm not aware of. The policy-level advocacy is slowly getting results in some countries, and where CISA leads, others may follow.
Chris: Was there a particular reason you alighted on the question of the adoption of reproducible builds in business? Do you think there's any truth behind the shopworn stereotype of hacker types neglecting the resources that business might be able to offer? Simon: Much of the motivation for the research came from the contrast between the visibility of R-Bs in open source projects and the relative invisibility of R-Bs in industry. Where companies are known to be using R-Bs (e.g. Google, etc.) there is no fuss, no hype. They were not selling R-Bs as a solution; instead, the documentation is very matter-of-fact that R-Bs are part of a customer-facing process in their cloud solutions. An obvious question for me was: if some people use R-Bs in software development, why doesn't everybody? There are limits to the tooling for some programming languages that mean R-Bs are difficult or impossible. But where creating an R-B is practical, why are they not used more widely? So, to your second question. There is another factor, which seems to be more about a lack of communication than a neglect of opportunities. Businesses may not always be willing to discuss their development processes and innovations. Though I do think the increasing number of conferences (big and small) for software practitioners is helping to facilitate more communication and a greater exchange of ideas.
Chris: Has your personal view of reproducible builds changed since before you embarked on writing this paper? Simon: Absolutely! In the early stages of the research, I was interested in questions of trust and how R-Bs were applied to resolve build and supply chain security problems. As the research developed, however, I started to see there were benefits to the use of R-Bs that were less obvious and that, in some cases, an R-B can have more than a single application.
Chris: Finally, do you have any plans to do future research touching on reproducible builds? Simon: Yes, definitely. There are a set of problems that interest me. One already mentioned is the use of reproducible builds with AI systems. Interpretable or explainable AI (XAI) is a necessity, and I think that R-Bs can be used to support traceability in the configuration and testing of both deployed systems and systems used during model training and evaluation. I would also like to return to a problem discussed briefly in the article, which is to develop a deeper understanding of the elements involved in the application of R-Bs that can be used to support reasoning about existing and potential applications of R-Bs. For example, R-Bs can be used to establish trust for different groups of individuals at different times, say, between remote developers prior to the release of software and by users after release. One question is whether the point in time at which an R-B is used might be a significant factor. Another group of questions concerns the ways in which trust (of some sort) propagates among users of an R-B. There is an example in the paper of a company that rebuilds Debian reproducibly for security reasons, and is then able to collaborate on software projects, where software is built reproducibly, with other companies that use public distributions of Debian.
Chris: Many thanks for this interview, Simon. If someone wanted to get in touch or learn more about you and your colleagues at the School of Informatics, where might they go? Simon: Thank you for the opportunity. It has been a pleasure to reflect a little more widely on the research! You can find out about my work on my official homepage and on my personal site. The software systems research group (SSRG) has a website, and the University of Skövde's English language pages are also available.


For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

30 July 2023

Russell Coker: Links July 2023

Phys.org has an interesting article about finding evidence for nanohertz gravitational waves [1]. 1 nanohertz corresponds to a wavelength of 31.7 light years! Wired has an interesting story about OpenAI saying that no further advances will be made with larger training models [2]. Bruce Schneier and Nathan Sanders wrote an insightful article about the need for government-run GPT-type systems [3]. They focus on the US, but having other countries or groups of countries do it would be good too. We could have a Chinese one, an EU one, etc. I don't think it would necessarily make sense for a small country like Australia to have one, but it would make a lot more sense than having nuclear submarines (which are much more expensive). The Roadmap project is a guide for learning new technologies [4]. The content seems quite good. Bigthink has an informative and darkly amusing article "Horror stories of cryonics: The gruesome fates of futurists hoping for immortality" [5]. From this month in Australia, psilocybin (the active ingredient in magic mushrooms) can be prescribed for depression, and MDMA (known as Ecstasy on the streets) can be prescribed for PTSD [6]. That's great news! Slate has an interesting article about the Operation Underground Railroad organisation that purports to help sex-trafficked children [7]. This is noteworthy now given the controversy over the recent movie about it. Apparently they didn't provide much help for kids after they had been rescued, and at least some of the kids were trafficked specifically to fulfill the demand that the organisation created by offering to pay. Vigilantes aren't as effective as law enforcement. The ACCC is going to prevent Apple and Google from forcing app developers to give them a share of in-app purchases in Australia [8]. We need this in every country! This site has links to open source versions of proprietary games [9]. Vice has an interesting article about the Hungarian neuroscientist Viktor Tóth, who taught rats to play Doom 2 [10]. The next logical step is to have mini tanks that they can use on real battlefields, like the Mason's Rats episode of Love Death and Robots on Netflix. Brian Krebs wrote a mind-boggling pair of blog posts about the Ashley Madison hack [11]. One of the people involved was a disgruntled Jewish ex-employee who sent anti-semitic harassment to the Jewish CEO and may have cooperated with anti-semitic organisations to harass him, but he killed himself (due to mental health problems) before the hack took place. Long Now has an insightful blog post about digital avatars being used after the death of the people they are based on [12]. Tavis Ormandy's description of the zenbleed bug is interesting [13]. The technique for finding the bug is interesting, as is the information on how the internals of the CPUs in question work. I don't think this means AMD is bad; trying to deliver increasing performance while limited by the laws of physics is difficult, and mistakes are sometimes made. Let's hope the microcode updates are well distributed. The Hacktivist documentary about Andrew "Bunnie" Huang is really good [14]. Bunnie's lecture about supply chain attacks is worth watching [15]. Most descriptions of this issue don't give nearly as much information. However bad you thought this problem was, after you watch this lecture you will realise it's worse than that!

12 July 2023

Reproducible Builds: Reproducible Builds in June 2023

Welcome to the June 2023 report from the Reproducible Builds project! In our reports, we outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.


We are very happy to announce the upcoming Reproducible Builds Summit, which is set to take place from October 31st to November 2nd 2023 in the vibrant city of Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. If you're interested in joining us this year, please make sure to read the event page, which has more details about the event and location. (You may also be interested in attending PackagingCon 2023, held a few days before in Berlin.)
This month, Vagrant Cascadian will present at FOSSY 2023 on the topic of Breaking the Chains of Trusting Trust:
Corrupted build environments can deliver compromised cryptographically signed binaries. Several exploits in critical supply chains have been demonstrated in recent years, proving that this is not just theoretical. The most well secured build environments are still single points of failure when they fail. [ ] This talk will focus on the state of the art from several angles in related Free and Open Source Software projects, what works, current challenges and future plans for building trustworthy toolchains you do not need to trust.
Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, FOSSY aims to be a community-focused event: "Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you". More information on the event is available on the FOSSY 2023 website, including the full programme schedule.
Marcel Fourné, Dominik Wermke, William Enck, Sascha Fahl and Yasemin Acar recently published an academic paper in the 44th IEEE Symposium on Security and Privacy titled "It's like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security". The abstract reads as follows:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created.
However, in contrast to other papers that touch on some theoretical aspect of reproducible builds, the authors' paper takes a different approach. Starting with the observation that much of the software industry believes R-Bs are "too far out of reach for most projects", and conjoining that with the goal "to help identify a path for R-Bs to become a commonplace property", the paper has a different methodology:
We conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, which heavily include communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.
A PDF of the paper is now available, as is an entry on the CISPA Helmholtz Center for Information Security website and an entry under the TeamUSEC Human-Centered Security research group.
On our mailing list this month:
The antagonist is David Schwartz, who correctly says "There are dozens of complex reasons why what seems to be the same sequence of operations might produce different end results", but goes on to say "I totally disagree with your general viewpoint that compilers must provide for reproducability [sic]". Dwight Tovey and I (Larry Doolittle) argue for reproducible builds. I assert "Any program, especially a mission-critical program like a compiler, that cannot reproduce a result at will is broken". Also, it's commonplace to take a binary from the net and check to see if it was trojaned, by attempting to recreate it from source.

Lastly, there were a few changes to our website this month too, including Bernhard M. Wiedemann adding a simplified Rust example to our documentation about the SOURCE_DATE_EPOCH environment variable [ ], Chris Lamb making it easier to parse our summit announcement at a glance [ ], Mattia Rizzolo adding the summit announcement itself [ ][ ][ ] and Rahul Bajaj adding a taxonomy of variations in build environments [ ].
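For context, SOURCE_DATE_EPOCH is the Reproducible Builds convention by which build tools take a fixed timestamp from the environment instead of reading the current time. Typical shell usage looks like this (the make invocation is purely illustrative):
export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)   # timestamp of the latest commit
make   # tools honouring the convention embed this date rather than the build time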

Distribution work 27 reviews of Debian packages were added, 40 were updated and 8 were removed this month, adding to our knowledge about identified issues. A new randomness_in_documentation_generated_by_mkdocs toolchain issue was added by Chris Lamb [ ], and the deterministic flag was dropped from the paths_vary_due_to_usrmerge issue, as we are not currently testing usrmerge issues [ ].
Roland Clobus posted his 18th update of the status of reproducible Debian ISO images on our mailing list. Roland reported that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid", but he also mentioned, amongst many changes, that not only are the non-free images being built (and are reproducible) but that the live images are generated officially by Debian itself. [ ]
Jan-Benedict Glaw noticed a problem when building NetBSD for the VAX architecture. Noting that reproducible builds are "probably not as reproducible as we thought", Jan-Benedict goes on to describe how two builds from different source directories won't produce the same result, and adds various notes about sub-optimal handling of the CFLAGS environment variable. [ ]
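A common mitigation for this class of problem, where the absolute source directory leaks into the built artifacts, is the -ffile-prefix-map compiler option (available in GCC 8 and later; whether NetBSD's build system applies it is not covered in the post). A sketch:
gcc -ffile-prefix-map=$PWD=. -o hello hello.c   # embeds ./hello.c rather than the absolute checkout path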
F-Droid added 21 new reproducible apps in June, resulting in a new record of 145 reproducible apps in total [ ]. (This page now sports missing data for March to May 2023.) F-Droid contributors also reported an issue with broken resources in APKs making some builds unreproducible. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE.

Upstream patches

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:
  • Additions to a (relatively) new Documented Jenkins Maintenance (djm) script to automatically shrink a cache & save a backup of old data [ ], automatically split out the previous month's data from logfiles into specially-named files [ ], prevent concurrent remote logfile fetches by using a lock file [ ] and to add/remove various debugging statements [ ].
  • Updates to the automated system health checks to, for example, correctly detect new kernel warnings due to a wording change [ ] and to explicitly observe which old/unused kernels should be removed [ ]. This was related to an improvement so that various kernel issues on Ubuntu-based nodes are automatically fixed. [ ]
Holger and Vagrant Cascadian updated all thirty-five hosts running Debian on the amd64, armhf, and i386 architectures to Debian bookworm, with the exception of the Jenkins host itself, which will be upgraded after the release of Debian 12.1. In addition, Mattia Rizzolo updated the email configuration for the @reproducible-builds.org domain to correctly accept incoming mails from jenkins.debian.net [ ] as well as to set up DomainKeys Identified Mail (DKIM) signing [ ]. Working together with Holger, Mattia also updated the Jenkins configuration to start testing Debian trixie, which meant that testing of Debian buster was stopped. And, finally, Jan-Benedict Glaw contributed patches for improved NetBSD testing.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

7 July 2023

Dirk Eddelbuettel: Rcpp 1.0.11 on CRAN: Updates and Maintenance

rcpp logo The Rcpp Core Team is delighted to announce that the newest release 1.0.11 of the Rcpp package arrived on CRAN and in Debian earlier today. Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distributions and of course at r2u. The release was finalized three days ago, but given the widespread use and extended reverse dependencies at CRAN it usually takes a few days to be processed. This release continues with the six-months January-July cycle started with release 1.0.5 in July 2020. As a reminder, we do of course make interim snapshot dev or rc releases available via the Rcpp drat repo and strongly encourage their use and testing; I run my systems with these versions, which tend to work just as well and are also fully tested against all reverse dependencies. Rcpp has long established itself as the most popular way of enhancing R with C or C++ code. Right now, 2720 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 251 in BioConductor. On CRAN, 13.7% of all packages depend (directly) on Rcpp, and 59.6% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 72.5 million times. The two published papers (also included in the package as preprint vignettes) have, respectively, 1678 (JSS, 2011) and 259 (TAS, 2018) citations, while the book (Springer useR!, 2013) has another 588. This release is incremental as usual, generally preserving existing capabilities faithfully while smoothing over corners and/or extending slightly, sometimes in response to changing and tightened demands from CRAN or R standards. The full list below details all changes, their respective PRs and, if applicable, issue tickets. Big thanks from all of us to all contributors!
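For anyone who has not tried Rcpp, a one-line smoke test can be run from the shell, assuming R, the Rcpp package and a C++ toolchain are installed; evalCpp() compiles the given C++ expression and returns its value to R:
Rscript -e 'library(Rcpp); evalCpp("21 * 2")'   # prints [1] 42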

Changes in Rcpp version 1.0.11 (2023-07-03)
  • Changes in Rcpp API:
    • Rcpp:::CxxFlags() now quotes only non-standard include paths on Linux (Lukasz in #1243 closing #1242).
    • Two unit tests no longer accidentally bark on stdout (Dirk and Iñaki in #1245).
    • Compilation under C++20 using clang++ and its standard library is enabled (Dirk in #1248 closing #1244).
    • Use backticks in a generated .Call() statement in RcppExports.R (Dirk in #1256 closing #1255).
    • Switch to system2() to capture standard error messages in error cases (Iñaki in #1259 and #1261 fixing #1257).
  • Changes in Rcpp Documentation:
    • The CITATION file format has been updated (Dirk in #1250 fixing #1249).
  • Changes in Rcpp Deployment:
    • A test for qnorm now uses the more accurate value from R 4.3.0 (Dirk in #1252 and #1260 fixing #1251).
    • Skip tests with path issues on Windows (Iñaki in #1258).
    • Container deployment in continuous integration was improved (Iñaki and Dirk in #1264, Dirk in #1269).
    • Several files received minor edits to please R CMD check from r-devel (Dirk in #1267).

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc. should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under the rcpp tag at StackOverflow, which also allows searching among the (currently) 2994 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
